Truncating a double floating point at a certain number of digits


I have written the following routine, which is supposed to truncate a C++ double at the n'th decimal place.

#include <cmath>

double truncate(double number_val, int n)
{
    double factor = 1;
    double previous = std::trunc(number_val); // integer portion, set aside and added back below
    number_val -= previous;
    for (int i = 0; i < n; i++) {
        number_val *= 10;
        factor *= 10;
    }
    number_val = std::trunc(number_val);
    number_val /= factor;
    number_val += previous; // add back integer portion
    return number_val;
}

Usually this works great, but I have found that some numbers, most notably those that do not have an exact representation as a double, cause problems.

For example, if the input is 2.0029 and I want to truncate it at the fifth place, the double appears to be stored internally as a value slightly below 2.0029 (something between 2.0028999999999999996 and 2.0028999999999999999). Truncating this at the fifth decimal place gives 2.00289, which may be right in terms of how the number is stored, but will look like the wrong answer to an end user.

If I were rounding instead of truncating at the fifth decimal, everything would be fine, of course. It also works fine if I pass a double whose decimal representation has more than n digits past the decimal point. How do I modify this truncation routine so that inaccuracies due to the imprecision of the double type and its decimal representation do not affect the result that the end user sees?

I think I may need some sort of rounding/truncation hybrid to make this work, but I'm not sure how I would write it.
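
A minimal sketch of such a rounding/truncation hybrid, not a definitive fix: scale the value, and if the scaled value lies within a few ULPs of an integer, snap to that integer; otherwise truncate. The name truncate_tolerant and the size of the tolerance are assumptions chosen for illustration, and the scale factor is only exact for n up to 22.

```cpp
#include <cmath>
#include <limits>

// Hypothetical helper: truncate x at the n'th decimal place, but treat a
// scaled value that is within a few ULPs of an integer as that integer.
double truncate_tolerant(double x, int n)
{
    double factor = 1.0;
    for (int i = 0; i < n; ++i) {
        factor *= 10.0; // powers of ten up to 1e22 are exact in a double
    }
    const double scaled = x * factor;
    const double nearest = std::round(scaled);
    // Assumed tolerance: a few ULPs at the magnitude of the scaled value.
    const double tol = 4.0 * std::numeric_limits<double>::epsilon() * std::fabs(scaled);
    const double kept = (std::fabs(scaled - nearest) <= tol) ? nearest
                                                             : std::trunc(scaled);
    return kept / factor;
}
```

With this sketch, truncate_tolerant(2.0029, 5) returns the double nearest to 2.0029, while truncate_tolerant(1.239, 2) still truncates to 1.23. The trade-off is that an input that genuinely lies a few ULPs below a boundary is rounded up to it.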

Edit: thanks for the responses so far, but perhaps I should clarify that this value is not necessarily going to be output; the truncation can be one step in a chain of many different user-specified operations on floating point numbers. Errors that accumulate within the double precision over multiple operations are fine, but no single operation, such as truncation or rounding, should produce a result that differs from its ideal value by more than half of an epsilon, where epsilon is the smallest magnitude representable by a double with the current exponent (that is, one unit in the last place). I am currently trying to digest the link on floating point arithmetic provided by iinspectable below to see if it will help me figure out how to do this.

Edit: the link gave me one idea, which is hacky but should probably work: put a line like number_val += std::numeric_limits<double>::epsilon() right at the top of the function, before doing anything else with the value. I don't know if there is a better way, though.
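
One caveat with that idea: std::numeric_limits<double>::epsilon() is the gap between 1.0 and the next double, so adding it to a larger value can leave that value unchanged. A sketch of a magnitude-aware nudge using std::nextafter follows. The helper name nudge_away_from_zero is an assumption, and whether a single step is enough for the routine above depends on how much error the later multiplications add.

```cpp
#include <cmath>
#include <limits>

// Hypothetical helper: move x to the adjacent double that is farther from zero.
// Unlike adding epsilon(), the step size follows the magnitude of x.
double nudge_away_from_zero(double x)
{
    const double inf = std::numeric_limits<double>::infinity();
    return std::nextafter(x, std::copysign(inf, x));
}
```

For x = 1.0 this is the same as adding epsilon(), but for x = 1.0e10 adding epsilon() changes nothing while the nudge still moves to the next representable value.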

Edit: I had an idea while I was on the bus today, which I haven't had a chance to thoroughly test yet, but it works by rounding the original number to 16 significant decimal digits, and then truncating that:

double truncate(double number_val, int n)
{
    bool negative = false;
    if (number_val == 0) {
        return 0;
    } else if (number_val < 0) {
        number_val = -number_val;
        negative = true;
    } 
    int pre_digits = std::log10(number_val) + 1;
    if (pre_digits < 17) {
        int post_digits = 17 - pre_digits;
        double factor = std::pow(10, post_digits);
        number_val = std::round(number_val * factor) / factor;
        factor = std::pow(10, n);
        number_val = std::trunc(number_val * factor) / factor;
    } else {
        number_val = std::round(number_val);
    }
    if (negative) {
        number_val = -number_val;
    }
    return number_val;
}

Since a double precision floating point number can only hold about 16 significant decimal digits anyway, this just might work for all practical purposes, at the cost of at most one digit of precision that the double might otherwise support.

I would like to further note that this question differs from the suggested duplicate above in that a) this is using C++, and not Java... I don't have a DecimalFormatter convenience class, and b) I am wanting to truncate, not round, the number at the given digit (within the precision limits otherwise allowed by the double datatype), and c) as I have stated before, the result of this function is not supposed to be a printable string... it is supposed to be a native floating point number that the end user of this function might choose to further manipulate. Accumulated errors over multiple operations due to imprecision in the double type are acceptable, but any single operation should appear to perform correctly to the limits of the precision of the double datatype.


There are 3 answers

4
Malcolm McLean On

I've looked into this. It's hard because you have inaccuracies due to the floating point representation, then further inaccuracies due to the decimal conversion: 0.1, for example, cannot be represented exactly in binary floating point. However, you can use the standard library function sprintf with a %g conversion, which should round accurately for you. Note that the precision passed to %g counts significant digits, not decimal places.

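 #include <cstdio>
 #include <cstdlib>

 int main() {
     char out[64];
     double x = 0.11111111;
     int n = 3;
     // "%.*g" takes the precision from n and rounds to n significant digits
     std::snprintf(out, sizeof out, "%.*g", n, x);
     double xrounded = std::strtod(out, nullptr); // parses "0.111"
     std::printf("%s -> %g\n", out, xrounded);
 }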
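
To truncate at n decimal places rather than round, one possible variation, sketched under assumptions, is to print with DBL_DIG (15) significant digits, which hides the binary representation error for short decimal inputs, and then cut the text after n decimals. The name truncate_via_text is hypothetical, the sketch assumes the default "C" locale, and it simply returns the input unchanged when %g switches to exponent notation.

```cpp
#include <cfloat>
#include <cstdio>
#include <cstdlib>
#include <cstring>

// Hypothetical helper: truncate x at the n'th decimal place by editing its
// decimal text. Assumes n >= 0.
double truncate_via_text(double x, int n)
{
    char buf[64];
    // DBL_DIG significant digits: short decimal inputs such as 2.0029 print
    // back exactly as typed, without the binary representation error.
    std::snprintf(buf, sizeof buf, "%.*g", DBL_DIG, x);
    if (std::strchr(buf, 'e') != nullptr) {
        return x; // exponent notation: not handled in this sketch
    }
    char* point = std::strchr(buf, '.');
    if (point != nullptr && std::strlen(point + 1) > static_cast<std::size_t>(n)) {
        point[n + 1] = '\0'; // keep the point plus n decimals, drop the rest
    }
    return std::strtod(buf, nullptr);
}
```

This costs a formatting and a parsing pass per call, and it gives up the last one or two digits of precision a double can carry, so it is a trade-off rather than a general answer.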
0
Jack Deeth On

OK, if I understand this right, you've got a floating point number and you want to truncate it to n digits:

10.099999
   ^^      n = 2

becomes

10.09
   ^^

But your function is truncating the number to an approximately close value:

10.08999999
   ^^

Which is then displayed as 10.08?

How about you keep your truncate formula, which does truncate as well as it can, and use std::setprecision and std::fixed to round the truncated value to the required number of decimal places? (Assuming it is std::cout you're using for output?)

#include <iostream>
#include <iomanip>

using std::cout;
using std::setprecision;
using std::fixed;
using std::endl;

int main() {
  double foo = 10.08995; // let's imagine this is the output of `truncate`

  cout << foo << endl;                             // displays 10.0899
  cout << setprecision(2) << fixed << foo << endl; // rounds to 10.09
}

I've set up a demo on wandbox for this.

0
Antonin GAVREL On

Get double as a string

If you are looking just to print the output, then it is very easy and straightforward using stringstream:

#include <cmath>
#include <iostream>
#include <iomanip>
#include <limits>
#include <sstream>

using namespace std;

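string truncateAsString(double n, int precision) {
    stringstream ss;
    // precision must be a power of ten (10^k). Since 10^k = 2^k * 5^k, the
    // GCC/Clang builtin __builtin_ctz(precision) yields k, the digit count.
    int digits = __builtin_ctz(precision);
    int remainder = (int)floor((n - floor(n)) * precision) % precision;
    ss << setprecision(numeric_limits<double>::max_digits10 + digits) << floor(n);
    if (remainder)
        ss << "." << setw(digits) << setfill('0') << remainder; // zero-pad so .059 is not printed as .59
    cout << ss.str() << endl;
    return ss.str();
}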

int main(void) {
    double a = 9636346.59235;
    int precision = 1000; // as many digits as you add zeroes. 3 zeroes means precision of 3.
    string s = truncateAsString(a, precision);
    return 0;
}

Getting the divided floating point with an exact value

If you are instead looking for the true decimal value of your floating point number, you can use the Boost.Multiprecision library.

The Boost.Multiprecision library can be used for computations requiring precision exceeding that of standard built-in types such as float, double and long double. For extended-precision calculations, Boost.Multiprecision supplies a template data type called cpp_dec_float. The number of decimal digits of precision is fixed at compile-time via template parameter.

Demonstration

#include <boost/math/constants/constants.hpp>
#include <boost/multiprecision/cpp_dec_float.hpp>
#include <iostream>
#include <limits>
#include <cmath>
#include <iomanip>

using boost::multiprecision::cpp_dec_float_50;

cpp_dec_float_50 truncate(cpp_dec_float_50 n, int precision) {
    cpp_dec_float_50 remainder = static_cast<cpp_dec_float_50>((int)floor((n - floor(n)) * precision) % precision) / static_cast<cpp_dec_float_50>(precision);
    return floor(n) + remainder;
}

int main(void) {
    int precision = 100000; // as many digits as you add zeroes. 5 zeroes means precision of 5.
    cpp_dec_float_50 n("9636346.59235789"); // construct from text so no binary double error is carried in
    n = truncate(n, precision); // first part is remainder, floor(n) is int value truncated.
    // For a power of ten, __builtin_ctz(precision) equals the number of trailing zeroes, exactly the precision we need.
    std::cout << std::setprecision(std::numeric_limits<cpp_dec_float_50>::max_digits10 + __builtin_ctz(precision)) << n << std::endl;
    return 0;
}

Output:

9636346.59235

NB: Requires sudo apt-get install libboost-all-dev