taking over memory from std::vector


I use an external library which operates on large quantities of data. The data is passed in by a raw pointer, plus the length. The library does not claim ownership of the pointer, but invokes a provided callback function (with the same two arguments) when it is done with the data.

The data gets prepared conveniently by using std::vector<T>, and I'd rather not give up this convenience. Copying the data is completely out of the question. Thus, I need a way to "take over" the memory buffer owned by an std::vector<T>, and (later on) deallocate it in the callback.

My current solution looks as follows:

std::vector<T> input = prepare_input();
T * data = input.data();
size_t size = input.size();
// move the vector to "raw" storage, to prevent deallocation
alignas(std::vector<T>) char temp[sizeof(std::vector<T>)];
new (temp) std::vector<T>(std::move(input));
// invoke the library
lib::startProcesing(data, size);

and, in the callback function:

void callback(T * data, size_t size) {
    std::allocator<T>().deallocate(data, size);
}

This solution works, because the standard allocator's deallocate function ignores its second argument (the element count) and simply calls ::operator delete(data). If it did not, bad things could happen, as the size of the input vector might be quite a bit smaller than its capacity.

My question is: is there a reliable (wrt. the C++ standard) way of taking over the buffer of std::vector and releasing it "manually" at some later time?

There are 3 answers

Michael Anderson On BEST ANSWER

You can't take ownership of the memory from a vector, but you can solve your underlying problem another way.

Here's how I'd approach it. It's a bit hacky because of the static global variable, and it is not thread safe as written, but it can be made so with some simple locking around accesses to the registry object.

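// maps each buffer handed to the library back to the vector that owns it
static std::map<T*, std::vector<T>*> registry;

void my_startProcessing(std::vector<T> * data) {
  registry[data->data()] = data;
  lib::startProcesing(data->data(), data->size());
}

void my_callback(T * data, size_t length) {
  auto it = registry.find(data);
  if (it == registry.end()) return;  // not a buffer we registered
  delete it->second;                 // destroys the vector and frees its buffer
  registry.erase(it);
}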
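
The locking mentioned above could look roughly like the sketch below. This is only an illustration, not the library's API: `BufferRegistry`, `adopt`, `release` and `pending` are made-up names, and unlike the snippet above it stores the vectors by value (moved in) instead of by pointer, so nothing has to be `delete`d by hand.

```cpp
#include <cstddef>
#include <map>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical helper: a mutex-guarded registry that keeps each vector
// alive until the library reports that it is done with the buffer.
template <typename T>
class BufferRegistry {
public:
    // Takes ownership of the vector; returns the pointer to pass to the library.
    // Assumes a non-empty vector (empty vectors may all report the same data()).
    T* adopt(std::vector<T>&& v) {
        T* key = v.data();
        std::lock_guard<std::mutex> lock(mutex_);
        // Move construction transfers the buffer, so key still points at the elements.
        buffers_.emplace(key, std::move(v));
        return key;
    }

    // Call this from the library's callback. Erasing the entry destroys the
    // vector and frees its buffer. Returns false for pointers never adopted.
    bool release(T* data) {
        std::lock_guard<std::mutex> lock(mutex_);
        return buffers_.erase(data) == 1;
    }

    // Number of buffers the library still holds.
    std::size_t pending() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return buffers_.size();
    }

private:
    mutable std::mutex mutex_;
    std::map<T*, std::vector<T>> buffers_;
};
```

With something like this, the call site would pass `registry.adopt(std::move(input))` together with the size (read before the move) to the library, and the callback would only call `registry.release(data)`.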

Now you can just do

std::vector<T> * input = ...
my_startProcessing(input);

But watch out! Bad things will happen if you add or remove elements in the input after you've called my_startProcessing, because the buffer the library holds may be invalidated. (You may be allowed to change values in the vector, as I believe that will write through to the data correctly, but that will depend on what the library allows too.)
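
To make the warning concrete, here is a small sketch of what goes wrong. `growth_moves_buffer` is just an illustrative name; the addresses are compared as integers so the stale pointer itself is never used after the reallocation.

```cpp
#include <cstdint>
#include <vector>

// Returns true if an in-place write left the buffer where it was
// and a later growth moved it to a different address.
bool growth_moves_buffer() {
    std::vector<int> v{1, 2, 3};

    // The address the library would be holding on to.
    const auto before = reinterpret_cast<std::uintptr_t>(v.data());

    v[0] = 42;  // changing a value writes into the same buffer
    const bool stable_after_write =
        reinterpret_cast<std::uintptr_t>(v.data()) == before;

    v.reserve(v.capacity() + 1);  // asking for more than capacity() reallocates
    const bool moved_after_growth =
        reinterpret_cast<std::uintptr_t>(v.data()) != before;

    return stable_after_write && moved_after_growth;
}
```

After the `reserve` call the old buffer has been freed, so a library still holding the old pointer would be reading freed memory.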

Also, this doesn't work if T = bool, since std::vector<bool> is a packed specialization and has no data() member.

d453 On

You could create a custom class built over a vector.

The key point here is to use move semantics in the SomeData constructor.

  • you get the prepared data without copying (note that the source vector will be left empty)
  • the data will be correctly disposed of by the thisData vector's destructor
  • the source vector can be destroyed with no issue
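
The "without copying" point can be checked directly. The sketch below uses a plain vector instead of SomeData, and `move_keeps_buffer` is only an illustrative name, but the move is the same one the SomeData constructor performs.

```cpp
#include <utility>
#include <vector>

// Returns true if the move handed over the original buffer instead of copying it.
bool move_keeps_buffer() {
    std::vector<int> source{1, 2, 3};
    const int* before = source.data();

    // Move construction transfers the buffer to the new vector.
    std::vector<int> target(std::move(source));

    // Same address as before, and the source is left empty.
    return target.data() == before && source.empty();
}
```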

Since the underlying storage is a contiguous array, you can get the start pointer and the data size in bytes (see SomeDataImpl.h below):

SomeData.h

#pragma once
#include <vector>

template<typename T>
class SomeData
{
    std::vector<T> thisData;

public:
    SomeData(std::vector<T> && other);

    const T* Start() const;
    size_t Size() const;
};

#include "SomeDataImpl.h"

SomeDataImpl.h

#pragma once
#include <utility>  // std::move

template<typename T>
SomeData<T>::SomeData(std::vector<T> && otherData) : thisData(std::move(otherData)) { }

template<typename T>
const T* SomeData<T>::Start() const {
    return thisData.data();
}

template<typename T>
size_t SomeData<T>::Size() const {
    return sizeof(T) * thisData.size();
}

Usage example:

#include <iostream>
#include "SomeData.h"

template<typename T>
void Print(const T * start, size_t size) {
    size_t toPrint = size / sizeof(T);
    size_t printed = 0;

    while(printed < toPrint) {
        std::cout << *(start + printed) << ", " << start + printed << std::endl;
        ++printed;
    }
}

int main () {
    std::vector<int> ints;
    ints.push_back(1);
    ints.push_back(2);
    ints.push_back(3);

    SomeData<int> someData(std::move(ints));
    Print<int>(someData.Start(), someData.Size());

    return 0;
}
alvion On

You can't do this in any kind of portable way, but you CAN do it in a way that will probably work on most C++ implementations. This code seems to work after a quick test on VS 2017.

#include <iostream>
#include <new>      // placement new
#include <utility>  // std::move
#include <vector>

using namespace std;

template <typename T>
T* HACK_stealVectorMemory(vector<T>&& toStealFrom)
{
    // Get a pointer to the vector's memory allocation
    T* vectorMemory = &toStealFrom[0];

    // Construct an empty vector in suitably aligned stack memory using placement new
    alignas(vector<T>) unsigned char buffer[sizeof(vector<T>)];
    vector<T>* fakeVector = new (&buffer) vector<T>();

    // Move the memory pointer from toStealFrom into our fakeVector, which will never be destroyed.
    (*fakeVector) = std::move(toStealFrom);

    return vectorMemory;
}

int main()
{
    vector<int> someInts = { 1, 2, 3, 4 };
    cout << someInts.size() << endl;

    int* intsPtr = HACK_stealVectorMemory(std::move(someInts));

    cout << someInts.size() << endl;

    cout << intsPtr[0] << ", " << intsPtr[3] << endl;

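    // The buffer came from std::allocator<int>, not from a new-expression,
    // so release it with ::operator delete rather than `delete intsPtr`.
    ::operator delete(intsPtr);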
}

Output:

4
0
1, 4
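
If you go down this road, it is probably worth keeping the capacity around as well, so the buffer can go back through std::allocator<T>::deallocate with the element count the vector most likely allocated. The sketch below is a variant of the hack above under that assumption. `StolenBuffer`, `HACK_stealWithCapacity` and `HACK_release` are made-up names, and it still assumes the vector used std::allocator<T> and asked it for exactly capacity() elements, which the standard does not promise.

```cpp
#include <cstddef>
#include <memory>
#include <new>
#include <utility>
#include <vector>

// Hypothetical record of a stolen buffer.
template <typename T>
struct StolenBuffer {
    T* data;
    std::size_t size;      // number of constructed elements
    std::size_t capacity;  // number of elements the buffer has room for
};

template <typename T>
StolenBuffer<T> HACK_stealWithCapacity(std::vector<T>&& source) {
    StolenBuffer<T> stolen{source.data(), source.size(), source.capacity()};

    // Move the buffer into a vector that is deliberately never destroyed,
    // so nothing frees the memory behind our back.
    alignas(std::vector<T>) unsigned char storage[sizeof(std::vector<T>)];
    new (storage) std::vector<T>(std::move(source));

    return stolen;
}

template <typename T>
void HACK_release(StolenBuffer<T> stolen) {
    // Destroy the elements that were constructed, then return the storage.
    for (std::size_t i = 0; i < stolen.size; ++i) {
        stolen.data[i].~T();
    }
    // ASSUMPTION: the vector obtained this buffer via allocate(capacity).
    std::allocator<T>().deallocate(stolen.data, stolen.capacity);
}
```

This addresses the size versus capacity mismatch from the question, but it is no more portable than the original hack. An implementation that allocates extra bookkeeping alongside the buffer would leak it.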