boost::multiprecision::cpp_int is being copied then deleted every time I try and print it

When I print a cpp_int from Boost, it seems that the entire object is copied.

#include <iostream>
#include <boost/multiprecision/cpp_int.hpp>
using std::cout;

void* operator new(size_t size) {
    void *memory = malloc(size);
    cout << "New: " << memory << " " << size << "\n";
    return memory;
}

int main() {
    auto u = new boost::multiprecision::cpp_int("987654321");
    cout << "------\n";
    cout << *u << "\n";
}

Output:

New: 0x23d4e70 32
------
New: 0x23d52b0 31
987654321

The confusing part is that the printing overload has the form ostream& operator<<(ostream&, const T&), which takes its argument by const reference, and passing *u to a function such as template <typename T> void cr(const T&) {} triggers no new memory allocation. I also tried u->str(), but that causes a second memory allocation as well.

I also tried overloading operator<< for cpp_int:

std::ostream& operator <<(std::ostream& stream, const boost::multiprecision::cpp_int& mpi) {
    return stream << mpi.str();
}

but the result was the same. However, I am also surprised that this compiled since I had expected there to already be an overload. My assumption is that I may need to modify something more back-end.

How can I avoid this? I don't want to be copying and then deleting 30+ bytes every time I want to print a cpp_int.

If that is not possible, switching to a different data type is not out of the question, as long as the interface is similar enough to keep refactoring minimal.

Answer by sehe (best answer)

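Your replacement operator new allocates with malloc, but you did not replace operator delete to go with it, so the default operator delete ends up freeing memory it did not allocate. That malloc/new mismatch invokes UB (as ubsan+asan will readily tell you).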

==32752==ERROR: AddressSanitizer: alloc-dealloc-mismatch (malloc vs operator delete
    #0 0x7fb58c15c407 in operator delete(void*, unsigned long) (/usr/lib/x86_64-lin
    #1 0x564b19759014 in __gnu_cxx::new_allocator<char>::deallocate(char*, unsigned
    #2 0x564b1974b8cb in std::allocator<char>::deallocate(char*, unsigned long) /us
    #3 0x564b1974b8cb in std::allocator_traits<std::allocator<char> >::deallocate(s
    #4 0x564b197478f4 in std::__cxx11::basic_string<char, std::char_traits<char>, s
    #5 0x564b19744f74 in std::__cxx11::basic_string<char, std::char_traits<char>, s
    #6 0x564b19741053 in std::__cxx11::basic_string<char, std::char_traits<char>, s
    #7 0x564b19744993 in std::ostream& boost::multiprecision::operator<< <boost::mu
    #8 0x564b1973da5a in main /home/sehe/Projects/stackoverflow/test.cpp:14
    #9 0x7fb58ab85bf6 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21bf6
    #10 0x564b1973d669 in _start (/home/sehe/Projects/stackoverflow/sotest+0x4a669)
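
For reference, a matched pair could look like the sketch below. It is only a minimal sketch and not part of the original test program: the counters g_news and g_deletes are hypothetical names added for illustration, and logging goes through std::printf instead of cout so that the logging itself is unlikely to re-enter operator new. The aligned (C++17) overloads are deliberately left out.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <new>

// Hypothetical counters, added for illustration only.
static std::size_t g_news = 0;
static std::size_t g_deletes = 0;

// Replacement allocation function: backed by malloc.
void* operator new(std::size_t size) {
    void* memory = std::malloc(size ? size : 1);  // never request 0 bytes
    if (!memory) throw std::bad_alloc{};
    ++g_news;
    std::printf("New: %p %zu\n", memory, size);
    return memory;
}

// Matching replacement deallocation function: backed by free.
void operator delete(void* memory) noexcept {
    if (!memory) return;  // deleting a null pointer is a no-op
    ++g_deletes;
    std::printf("Delete: %p\n", memory);
    std::free(memory);
}

// Sized form (C++14); this is the overload that shows up in the ASan trace.
void operator delete(void* memory, std::size_t) noexcept {
    ::operator delete(memory);
}
```

With both halves replaced, every block handed out by malloc is returned through free, so the alloc-dealloc-mismatch report should no longer apply to these allocations.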

So let's focus on the claim:

#include <boost/multiprecision/cpp_int.hpp>
#include <iostream>

using Int = boost::multiprecision::cpp_int;

int main() {
    Int u("987654321");
    std::cout << u << "\n";
}

When we ask clang to follow the overload for operator<< it leads us here:

template <class Backend, expression_template_option ExpressionTemplates>
inline std::ostream& operator<<(std::ostream& os, const number<Backend, ExpressionTemplates>& r)
{
   std::streamsize d  = os.precision();
   std::string     s  = r.str(d, os.flags());
   std::streamsize ss = os.width();
   if (ss > static_cast<std::streamsize>(s.size()))
   {
      char fill = os.fill();
      if ((os.flags() & std::ios_base::left) == std::ios_base::left)
         s.append(static_cast<std::string::size_type>(ss - s.size()), fill);
      else
         s.insert(static_cast<std::string::size_type>(0), static_cast<std::string::size_type>(ss - s.size()), fill);
   }
   return os << s;
}

As you can see, the number is taken by const-reference, so no copy is performed. There will be allocations for the buffers (in the str() implementation), though. I don't think the Multiprecision library is going to boast a highly optimized implementation of IO operations.

Memory Profiling

To see exactly what allocations are done where, I ran a debug build through Massif:

At the peak, the top allocations are:

99.97% (73,759B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
->98.54% (72,704B) 0x50DDFA4: ??? (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.28
| ->98.54% (72,704B) 0x40108F1: _dl_init (dl-init.c:72)
|   ->98.54% (72,704B) 0x40010C8: ??? (in /lib/x86_64-linux-gnu/ld-2.27.so)
|     
->01.39% (1,024B) 0x58C526A: _IO_file_doallocate (filedoalloc.c:101)
| ->01.39% (1,024B) 0x58D5447: _IO_doallocbuf (genops.c:365)
|   ->01.39% (1,024B) 0x58D4566: _IO_file_overflow@@GLIBC_2.2.5 (fileops.c:759)
|     ->01.39% (1,024B) 0x58D2ABB: _IO_file_xsputn@@GLIBC_2.2.5 (fileops.c:1266)
|       ->01.39% (1,024B) 0x58C6A55: fwrite (iofwrite.c:39)
|         ->01.39% (1,024B) 0x516675A: std::basic_ostream<char, std::char_traits<ch
|           ->01.39% (1,024B) 0x10C85C: std::ostream& boost::multiprecision::operat
|             ->01.39% (1,024B) 0x10B6C6: main (test.cpp:8)
|               
->00.04% (31B) in 1 place, below massif's threshold (1.00%)

Sadly, I could not get the threshold below 1% (this might be a documented limit).

What we can see is that though the 31B allocation happens somewhere, it is dwarfed by the file output buffer (1024B).

If we replace the output statement by just

return u.str().length();

you can still witness the 31B allocation, which does NOT match the size of the cpp_int type. Indeed, iff we were to copy THAT:

return std::make_unique<Int>(u) ? 0 : 1;

THEN we see a 32B allocation instead:

->00.04% (32B) in 1 place, below massif's threshold (1.00%)

It is pretty clear that the cpp_int is not being copied, which is what you would expect.