c++ boost MPI & threading - serialize errors: Address not mapped


I'm stumped. all_gather works for primitives (e.g. int) but fails even for simple STL containers. valgrind claims that the container was not allocated/initialized, but that doesn't seem right.

In summary:

  • I do some multi-threading with openMP, then rejoin threads.
  • In serial, I try to all_gather a simple std::map using `boost::mpi::all_gather`. The MPI ranks are not the threads. (There are 2 MPI ranks, and each MPI rank has 4 threads.)
  • Then I intend to do some more (isolated) multi-threading.

It seems so straightforward... what could possibly be going on here?

main.cpp

#include <openmpi/mpi.h>
#include <omp.h>
#include <boost/mpi.hpp>    
#include "globals.h"

int main(int argc, char* argv[])
{        

    int provided_MPI;
    MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided_MPI );

    boost::mpi::environment my_boost_mpi_env(argc, argv);
    boost::mpi::communicator world_MPI_boost;        
    world_MPI_boost_ptr = &world_MPI_boost;
        // ^^^ global variable of type   boost::mpi::communicator *

    perform_complete_variable_elimination_schedule();
    //...

}

Conn_Comp.cpp

#include <boost/mpi.hpp>    
#include <boost/mpi/collectives.hpp>
#include <boost/serialization/serialization.hpp>
#include <boost/serialization/vector.hpp>
#include <boost/serialization/map.hpp>

#include "globals.h"

...

void perform_complete_variable_elimination_schedule()
{

    // isolated work in parallel using OpenMP
    #pragma omp parallel
    { 
    //work
    }    

    // SERIAL REGION (with respect to threading).

    std::map<uint,uint> my_map;
    std::vector< std::map<uint,uint> >   vec_of_my_maps;

    boost::mpi::all_gather<    std::map<uint,uint>    >
                     (*world_MPI_boost_ptr,
                      my_map,
                      vec_of_my_maps);  //  <--- line 293 (referenced by valgrind)


    // more isolated work in parallel using OpenMP
    #pragma omp parallel
    { 
    //work
    }

}

valgrind reports an invalid read inside the all_gather call that fills the vector of maps. But this vector was created immediately before the all_gather call, so it is obviously in scope and not in a parallel-threaded region. Selected valgrind error output:

==12665== Use of uninitialised value of size 4
==12665==    at 0x41C8D7A: boost::archive::detail::basic_iarchive::get_library_version() const (basic_iarchive.cpp:575)
==12665==    by 0x41C92C6: boost::archive::detail::basic_iarchive::load_object(void*, boost::archive::detail::basic_iserializer const&) (basic_iarchive.cpp:399)
==12665==    by 0x80F5696: void boost::mpi::all_gather<std::map<unsigned int, unsigned int, std::less<unsigned int>, std::allocator<std::pair<unsigned int const, unsigned int> > > >(boost::mpi::communicator const&, std::map<unsigned int, unsigned int, std::less<unsigned int>, std::allocator<std::pair<unsigned int const, unsigned int> > > const&, std::vector<std::map<unsigned int, unsigned int, std::less<unsigned int>, std::allocator<std::pair<unsigned int const, unsigned int> > >, std::allocator<std::map<unsigned int, unsigned int, std::less<unsigned int>, std::allocator<std::pair<unsigned int const, unsigned int> > > > >&) (iserializer.hpp:387)
==12665==    by 0x80DEC83: Conn_Comp::perform_complete_variable_elimination_schedule() (Conn_Comp.cpp:293)
==12665==    by 0x80C840A: main (main.cpp:695)
==12665== 
==12665== Invalid read of size 2
==12665==    at 0x41C8D7A: boost::archive::detail::basic_iarchive::get_library_version() const (basic_iarchive.cpp:575)
==12665==    by 0x41C92C6: boost::archive::detail::basic_iarchive::load_object(void*, boost::archive::detail::basic_iserializer const&) (basic_iarchive.cpp:399)
==12665==    by 0x80F5696: void boost::mpi::all_gather<std::map<unsigned int, unsigned int, std::less<unsigned int>, std::allocator<std::pair<unsigned int const, unsigned int> > > >(boost::mpi::communicator const&, std::map<unsigned int, unsigned int, std::less<unsigned int>, std::allocator<std::pair<unsigned int const, unsigned int> > > const&, std::vector<std::map<unsigned int, unsigned int, std::less<unsigned int>, std::allocator<std::pair<unsigned int const, unsigned int> > >, std::allocator<std::map<unsigned int, unsigned int, std::less<unsigned int>, std::allocator<std::pair<unsigned int const, unsigned int> > > > >&) (iserializer.hpp:387)
==12665==    by 0x80DEC83: Conn_Comp::perform_complete_variable_elimination_schedule() (main.cpp:293)
==12665==    by 0x80C840A: main (main.cpp:695)
==12665==  Address 0x3580bece is not stack'd, malloc'd or (recently) free'd
==12665== 
[drosphila:12665] *** Process received signal ***
[drosphila:12665] Signal: Segmentation fault (11)
[drosphila:12665] Signal code: Address not mapped (1)
[drosphila:12665] Failing at address: 0x3580bece
[drosphila:12665] [ 0] /lib/i686/cmov/libpthread.so.0(+0xe500) [0x44f8500]
[drosphila:12665] [ 1] /usr/lib/libboost_serialization.so.1.42.0(_ZN5boost7archive6detail14basic_iarchive11load_objectEPvRKNS1_17basic_iserializerE+0x1b7) [0x41c92c7]
[drosphila:12665] [ 2] ./detect_NAHR(_ZN5boost3mpi10all_gatherISt3mapIjjSt4lessIjESaISt4pairIKjjEEEEEvRKNS0_12communicatorERKT_RSt6vectorISD_SaISD_EE+0x587) [0x80f5697]
[drosphila:12665] [ 3] ./detect_NAHR(_ZN9Conn_Comp46perform_complete_variable_elimination_scheduleEv+0x534) [0x80dec84]
[drosphila:12665] [ 4] ./detect_NAHR(main+0xf5b) [0x80c840b]
[drosphila:12665] [ 5] /lib/i686/cmov/libc.so.6(__libc_start_main+0xe6) [0x4519ca6]
[drosphila:12665] [ 6] ./detect_NAHR() [0x80c73e1]
[drosphila:12665] *** End of error message ***

I use MPI_Init_thread based on a recommendation from a boost help page.

As I said at the top, if I use a primitive (i.e. just uint) instead of a map, then the all_gather works fine. Why should the map fail? Boost.Serialization already provides serializers for STL containers, so that is not the problem...

Note also that the vector which will hold all of the values is automatically resized in all_gather (I checked the implementation of all_gather) to be big enough to hold everything. Regardless, even if I size it myself beforehand, it still fails.

Finally, even if I use a plain old array (properly allocated) e.g. std::map<uint,uint> *, I get the same problem.


1 answer

cmo On

Well, this is embarrassing. I'm going to leave the question up in case anybody else has the same strange errors.

The problem with my code was actually in the makefile. The -lboost_mpi flag was there, but I never told the linker (or the runtime loader) where my Boost libraries live.

Incorrect makefile flags:

-I$(BOOST_INCLUDE)     -lboost_serialization   -lboost_mpi 

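Apparently that line contains just enough information to allow the program to compile, link, and start, but it results in a runtime error. Presumably, with no -L or rpath, the linker and loader fall back to whatever Boost is on the default search path (the backtrace above shows /usr/lib/libboost_serialization.so.1.42.0), and that build need not match the headers in $(BOOST_INCLUDE).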

Correct makefile flags:

-L$(BOOST_LIB) -ldl -Wl,-rpath,$(BOOST_LIB) -lboost_serialization -lboost_mpi

(Notice the added library search path and rpath flags, -L$(BOOST_LIB) and -Wl,-rpath,$(BOOST_LIB), plus -ldl.)
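If anybody wants to confirm from inside the program which Boost shared libraries actually got loaded, something like the following might help. It is only a sketch and assumes Linux with glibc (dl_iterate_phdr is a glibc extension, not standard C++). The helper names loaded_libraries, libraries_matching and soname_version are made up for this example and are not part of Boost or MPI.

```cpp
// Linux/glibc-only sketch: report which shared objects this process loaded.
#include <link.h>    // dl_iterate_phdr, struct dl_phdr_info (glibc extension)

#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Callback run once per loaded object; stores every non-empty path.
static int collect_name(struct dl_phdr_info* info, std::size_t /*size*/, void* data)
{
    auto* names = static_cast<std::vector<std::string>*>(data);
    if (info->dlpi_name != nullptr && info->dlpi_name[0] != '\0')
        names->push_back(info->dlpi_name);
    return 0;  // 0 tells dl_iterate_phdr to keep going
}

// Hypothetical helper: paths of all shared objects mapped into this process.
std::vector<std::string> loaded_libraries()
{
    std::vector<std::string> names;
    const int rc = dl_iterate_phdr(collect_name, &names);
    assert(rc == 0);  // our callback never asks to stop early
    (void)rc;         // avoid an unused-variable warning under NDEBUG
    return names;
}

// Hypothetical helper: keep only the paths that contain `needle`.
std::vector<std::string> libraries_matching(const std::vector<std::string>& all,
                                            const std::string& needle)
{
    std::vector<std::string> hits;
    for (const std::string& path : all)
        if (path.find(needle) != std::string::npos)
            hits.push_back(path);
    return hits;
}

// Hypothetical helper: version suffix of a path such as
// "/usr/lib/libboost_mpi.so.1.42.0" (gives "1.42.0"), or "" if there is none.
std::string soname_version(const std::string& path)
{
    const std::string marker = ".so.";
    const std::string::size_type pos = path.rfind(marker);
    return pos == std::string::npos ? std::string()
                                    : path.substr(pos + marker.size());
}
```

Calling libraries_matching(loaded_libraries(), "libboost_") early in main and printing each path with its soname_version shows whether the loader picked the libraries in $(BOOST_LIB) or a system copy. The version can then be compared by eye against BOOST_LIB_VERSION from <boost/version.hpp>, which uses underscores (e.g. "1_42"). Running ldd on the executable gives similar information from the shell.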