How to append a large number of elements to stxxl vector efficiently?


I need to append a large number of elements to a stxxl vector. What is the most efficient way of adding elements to a stxxl vector? Right now, I'm using push_back of the stxxl vector, but it doesn't seem very efficient. It's far from saturating the disk bandwidth. Is there a better way?

Thanks, Da

3

There are 3 answers

1
Nemo On

According to the documentation:

"If one needs only to sequentially write elements to the vector in n/B I/Os the currently fastest method is stxxl::generate."

This does not really explain why push_back should be I/O-inefficient, though.
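To make the "n/B I/Os" in that quote concrete: writing n elements sequentially should touch each block of B elements once. Below is a minimal sketch of that arithmetic. The helper name sequential_block_ios is made up for illustration and is not part of STXXL.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of STXXL): number of block transfers needed
// to write n elements sequentially when each block holds B elements.
// This is ceil(n / B), i.e. the "n/B I/Os" bound from the quote.
std::uint64_t sequential_block_ios(std::uint64_t n, std::uint64_t B)
{
    assert(B > 0);  // a block must hold at least one element
    return n / B + (n % B != 0 ? 1 : 0);
}
```

For example, assuming 2 MiB blocks of 4-byte integers (B = 524288, a figure chosen only for illustration), sequential_block_ios(1000000000, 524288) gives 1908 block writes for a billion elements. If push_back falls far short of disk bandwidth, the cause is more likely per-element overhead or missing overlap of I/O and computation than the number of block transfers.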

0
justin On

One approach:

  • First reserve the number of elements you need. Resizing a vector with some types can be very time consuming. Appending many elements can result in several resizes as the vector grows.

  • Once the capacity is reserved, append using emplace_back (or simply push_back if the type is trivial, e.g. int).

Also review the member functions. An implementation which suits your needs well may already exist.
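The effect of reserving first can be sketched with std::vector as a stand-in. This is an assumption made for illustration: the code below does not use STXXL, and whether stxxl::vector grows the same way or offers emplace_back is not checked here. The function name count_capacity_changes is hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Appends n ints to a std::vector and counts how often its capacity changes.
// Each change implies a reallocation plus a copy/move of existing elements.
// If reserve_first is true, the capacity is reserved before appending.
std::size_t count_capacity_changes(std::size_t n, bool reserve_first)
{
    std::vector<int> v;
    if (reserve_first) v.reserve(n);

    std::size_t changes = 0;
    std::size_t cap = v.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        v.push_back(static_cast<int>(i));
        if (v.capacity() != cap) {  // the vector had to grow
            ++changes;
            cap = v.capacity();
        }
    }
    return changes;
}
```

With reserve_first set to true the function returns 0, because reserve(n) guarantees room for n elements. Without it, the count depends on the growth policy of the standard library in use.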

0
Timo Bingmann On

Most of the things written about "Efficient Sequential Reading and Writing to Vectors" apply in your case.

Besides vector_bufwriter, which fills a vector using an imperative loop, there is also a variant of stxxl::stream::materialize() which does it in a functional programming style.
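The idea behind a buffered writer can be illustrated with a toy model. The class below is not STXXL's vector_bufwriter and does not use STXXL at all. ToyBufWriter, its members, and the in-memory "disk" are assumptions made only to show the imperative-loop style and why whole-block writes are cheaper than element-wise access.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model only: NOT STXXL's vector_bufwriter. It collects elements in a
// block-sized buffer and hands each complete block to the "disk" in one step.
class ToyBufWriter
{
public:
    explicit ToyBufWriter(std::size_t block_elems)
        : block_elems_(block_elems)
    {
        assert(block_elems > 0);
        buffer_.reserve(block_elems);
    }

    // Imperative-loop interface: writer << value;
    ToyBufWriter& operator<<(int value)
    {
        buffer_.push_back(value);
        if (buffer_.size() == block_elems_) flush();
        return *this;
    }

    // Writes out a partially filled last block, if there is one.
    void finish()
    {
        if (!buffer_.empty()) flush();
    }

    std::size_t block_writes() const { return block_writes_; }
    const std::vector<int>& disk() const { return disk_; }

private:
    void flush()
    {
        disk_.insert(disk_.end(), buffer_.begin(), buffer_.end());
        buffer_.clear();
        ++block_writes_;
    }

    std::size_t block_elems_;
    std::vector<int> buffer_;
    std::vector<int> disk_;  // stand-in for external storage
    std::size_t block_writes_ = 0;
};
```

Writing 10 elements through a ToyBufWriter with a block size of 4 triggers two block writes during the loop and a third on finish(). A real buffered writer would additionally overlap these writes with the filling of the next buffer, which this sketch does not model.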

About knowing the vector's size in advance: this is not really necessary for external memory (EM), since blocks can be allocated on the fly. The blocks will then generally not be in order on disk, but there is no guarantee of that anyway.

I see someone (me) made vector_bufwriter automatically double the vector's size when the filling reaches the vector's end. At the moment, I don't think this is necessary; maybe this behaviour should be changed.