The most accurate way to calculate numerator and denominator of a double

1.8k views Asked by At

I have implemented class NaturalNum for representing a natural number of "infinite" size (up to 4GB).

I have also implemented class RationalNum for representing a rational number with infinite accuracy. It stores the numerator and the denominator of the rational number, both of which are NaturalNum instances, and relies on them when performing any arithmetic operation issued by the user.

The only place where precision is "dropped by a certain degree", is upon printing, since there's a limit (provided by the user) to the number of digits that appear after the decimal (or non-decimal) point.

My question concerns one of the constructors of class RationalNum. Namely, the constructor that takes a double value, and computes the corresponding numerator and denominator.

My code is given below, and I would like to know if anyone sees a more accurate way for computing them:

RationalNum::RationalNum(double value)
{
    if (value == value+1)
        throw "Infinite Value";
    if (value != value)
        throw "Undefined Value";

    m_sign        = false;
    m_numerator   = 0;
    m_denominator = 1;

    if (value < 0)
    {
        m_sign = true;
        value  = -value;
    }

    // Here is the actual computation
    while (value > 0)
    {
        unsigned int floor = (unsigned int)value;
        value         -= floor;
        m_numerator   += floor;
        value         *= 2;
        m_numerator   *= 2;
        m_denominator *= 2;
    }

    NaturalNum gcd = GCD(m_numerator,m_denominator);
    m_numerator   /= gcd;
    m_denominator /= gcd;
}

Note: variables starting with 'm_' are member variables.

Thanks

2

There are 2 answers

5
Jan Hudec On

The standard library contains a function for obtaining the significand and exponent, frexp.

Just multiply the significand to get all bits before decimal point and set appropriate denominator. Just don't forget the significand is normalized to be between 0.5 and 1 (I would consider between 1 and 2 more natural but whatever) and that it has 53 significant bits for IEEE double (there are no practically used platforms that would use different floating point format).

12
vmrob On

I'm not 100% confident in the math that you have for the actual computation only because I haven't really examined it, but I think the below method removes the need to use the GCD function which could bring in some unnecessary running time.

Here is the class I came up with. I haven't fully tested it, but I produced a couple billion random doubles and the asserts never fired, so I'm reasonably confident in its usability, but I would still test the edge cases around INT64_MAX a little more.

If I'm not mistaken, the running time complexity of this algorithm is linear with respect to the size in bits of the input.

#include <iostream>
#include <cmath>
#include <cassert>
#include <limits>

class Real;

namespace std {
    inline bool isnan(const Real& r);
    inline bool isinf(const Real& r);
}

class Real {
public:
    Real(double val)
        : _val(val)
    {
        if (std::isnan(val)) { return; }
        if (std::isinf(val)) { return; }

        double d;

        if (modf(val, &d) == 0) {
            // already a whole number
            _num = val;
            _den = 1.0;
            return;
        }

        int exponent;
        double significand = frexp(val, &exponent); // val = significand * 2^exponent
        double numerator = val;
        double denominator = 1;

        // 0.5 <= significand < 1.0
        // significand is a fraction, multiply it by two until it's a whole number
        // subtract exponent appropriately to maintain val = significand * 2^exponent
        do {
            significand *= 2;
            --exponent;
            assert(std::ldexp(significand, exponent) == val);
        } while (modf(significand, &d) != 0);

        assert(exponent <= 0);  

        // significand is now a whole number
        _num = significand;
        _den = 1.0 / std::ldexp(1.0, exponent);

        assert(_val == _num / _den);
    }

    friend std::ostream& operator<<(std::ostream &os, const Real& rhs);
    friend bool std::isnan(const Real& r);
    friend bool std::isinf(const Real& r);

private:
    double _val = 0;
    double _num = 0;
    double _den = 0;
};

std::ostream& operator<<(std::ostream &os, const Real& rhs) {
    if (std::isnan(rhs) || std::isinf(rhs)) {
        return os << rhs._val;
    }
    if (rhs._den == 1.0) {
        return os << rhs._num;
    }
    return os << rhs._num << " / " << rhs._den;
}

namespace std {
    inline bool isnan(const Real& r) { return std::isnan(r._val); }
    inline bool isinf(const Real& r) { return std::isinf(r._val); }
}

#include <iomanip>

int main () {

    #define PRINT_REAL(num) \
        std::cout << std::setprecision(100) << #num << " = " << num << " = " << Real(num) << std::endl

    PRINT_REAL(1.5);
    PRINT_REAL(123.875);
    PRINT_REAL(0.125);

    // double precision issues
    PRINT_REAL(-10000000000000023.219238745);
    PRINT_REAL(-100000000000000000000000000000000000000000.5);

    return 0;
}

Upon looking at your code a little bit more, there's at least a problem with your testing for infinite values. Note the following program:

#include <numeric>
#include <cassert>
#include <cmath>

int main() {
    {
        double d = std::numeric_limits<double>::max(); // about 1.7976931348623e+308

        assert(!std::isnan(d));
        assert(!std::isinf(d));

        // assert(d != d + 1); // fires
    }

    {
        double d = std::ldexp(1.0, 500); // 2 ^ 700
        assert(!std::isnan(d));
        assert(!std::isinf(d));

        // assert(d != d + 1); // fires
    }
}

In addition to that, if your GCD function doesn't support doubles, then you'll be limiting yourself in terms of values you can import as doubles. Try any number > INT64_MAX and the GCD function may not work.