Understanding composition of lazy range-based functions

124 views Asked by At

TL;DR

What's wrong with the last commented block of lines below?

// headers and definitions are in the down the question
int main() {

    std::vector<int> v{10,20,30};

    using type_of_temp = std::vector<std::pair<std::vector<int>,int>>;

    // seems to work, I think it does work
    auto temp = copy_range<type_of_temp>(v | indexed(0)
                                           | transformed(complex_keep_index));
    auto w = temp | transformed(distribute);
    print(w);

    // shows undefined behavior
    //auto z = v | indexed(0)
    //           | transformed(complex_keep_index)
    //           | transformed(distribute);
    //print(z);
}

Or, in other words, what makes piping v | indexed(0) into transformed(complex_keep_index) well defined, but piping v | indexed(0) | transformed(complex_keep_index) into transformed(distribute) undefined behavior?

Extended version

I have a container of things,

std::vector<int> v{10,20,30};

and I have a function which generates another container from each of those things,

// this is in general a computation of type
//      T -> std::vector<U>
constexpr auto complex_comput = [](auto const& x){
    return std::vector{x,x+1,x+2}; // in general the number of elements changes
};

so if I was to apply the complex_comput to v, I'd get,

{{10, 11, 12}, {20, 21, 22}, {30, 31, 32}}

and if I was to also concatenate the results, I'd finally get this:

{10, 11, 12, 20, 21, 22, 30, 31, 32}

However, I want to keep track of the index where each number came from, in a way that the result would encode something like this:

0 10
0 11
0 12
1 20
1 21
1 22
2 30
2 31
2 32

To accomplish this, I (eventually) came up with this solution, where I attempted to make use of ranges from Boost. Specifically I do the following:

  1. use boost::adaptors::indexed to attach the index to each element of v
  2. transform each resulting "pair" in a std::pair storing the index and result of the application of complex_comput to the value,
  3. and finally transforming each std::pair<st::vector<int>,int> in a std::vector<std::pair<int,int>>.

However, I had to give up on the range between 2 and 3, using a helper "true" std::vector in between the two transformations.

#include <boost/range/adaptor/indexed.hpp>
#include <boost/range/adaptor/transformed.hpp>
#include <boost/range/iterator_range_core.hpp>
#include <iostream>
#include <utility>
#include <vector>

using boost::adaptors::indexed;
using boost::adaptors::transformed;
using boost::copy_range;

constexpr auto complex_comput = [](auto const& x){
// this is in general a computation of type
//      T -> std::vector<U>
    return std::vector{x,x+1,x+2};
};
constexpr auto complex_keep_index = [](auto const& x){
    return std::make_pair(complex_comput(x.value()), x.index());
};
constexpr auto distribute = [](auto const& pair){
    return pair.first | transformed([n = pair.second](auto x){
        return std::make_pair(x, n);
    });
};

template<typename T>
void print(T const& w) {
    for (auto const& elem : w) {
        for (auto i : elem) {
            std::cout << i.second << ':' << i.first << ' ';
        }
        std::cout << std::endl;
    }
}

int main() {

    std::vector<int> v{10,20,30};

    using type_of_temp = std::vector<std::pair<std::vector<int>,int>>;

    auto temp = copy_range<type_of_temp>(v | indexed(0)
                                           | transformed(complex_keep_index));
    auto w = temp | transformed(distribute);
    print(w);

    //auto z = v | indexed(0)
    //           | transformed(complex_keep_index)
    //           | transformed(distribute);
    //print(z);
}

Indeed, decommenting the lines defining and using z gives you a code that compiles but generates rubbish results, i.e. undefined behavior. Note that applying copy_range<type_of_temp> to the first, working, range is necessary, otherwise the resulting code is essetially the same as the one on the right of auto z =.

Why do I have to do so? What are the details that makes the oneliner not work?

I partly understand the reason, and I'll list my understanding/thoughts in the following, but I'm asking this question to get a thorough explanation of all the details of this.

  • I understand that the undefined behavior I observe stems from z being a range whose defining a view on some temporary which has been destroyed;
  • given the working version of the code, it is apparent that temporary is v | indexed(0) | transformed(complex_keep_index);
  • however, isn't v | indexed(0) itself a temporary that is fed to transformed(complex_keep_index)?
  • Probably one important detail is that the expression v | indexed(0) is no more than a lazy range, which evaluates nothing, but just sets things up such that when one iterates on the range the computations is done; after all I can easily do v | indexed(0) | indexed(0) | indexed(0), which is well defined;
  • and also the whole v | indexed(0) | transformed(complex_keep_index) is well defined, otherwise the code above using w would probably misbehave (I know that UB doesn't mean that the result has to show something is wrong, and things could just look ok on this hardware, in this moment, and break tomorrow).
  • So there's something inherently wrong is passing an rvalue to transformed(distribute);
  • but what's wrong in doing so lies in distribute, not in transformed, because for instance changing distribute to [](auto x){ return x; } seems to be well defined.
  • So what's wrong with distribute? Here's the code
constexpr auto distribute = [](auto const& pair){
    return pair.first | transformed([n = pair.second](auto x){
        return std::make_pair(x, n);
    });
};
  • What's the problem with it? The returned range (output of this transformed) will hold some iterators/pointers/references to pair.first which is part of goes out of scope when distribute returns, but pair is a reference to something in the caller, which keeps living, right?
  • However I know that even though a const reference (e.g. pair) can keep a temporary (e.g. the elements of v | indexed(0) | transformed(complex_keep_index)) alive, that doesn't mean that the temporary stays alive when that reference goes out of scope just because it is in turn referenced by something else (references/pointers/iterators in the output of transformed([n = …](…){ … })) which doesn't go out of scope.

I think/hope that probably the answer is already in what I've written above, however I need some help to streamline all of that so that I can understand it once and for all.

0

There are 0 answers