How to get part of a std::string into a streambuf without copying?


I'm using boost asio a lot lately and I find that I'm working with std::strings and asio::streambufs quite a bit. I find that I'm trying to get data back and forth between streambufs and strings a lot as part of parsing network data. In general, I don't want to mess around with 'formatted io', so iostreams aren't very useful. I've found that while ostream::operator<<(), in spite of the official documentation, seems to relay my strings into streambufs unmolested, istream::operator>>() mangles the contents of my streambufs (as you would expect given that it's 'formatted').

It really seems to me like the standard library is missing a whole lot of iterators and stream objects for dealing with streambufs and strings and unformatted io. For example, if I want to get a substring of a string into a streambuf, how do I do that without creating a copy of the string? A basic all-in-all-out transfer can be accomplished like:

// Get a whole string into a streambuf, and then get the whole streambuf back
//  into another string
{
    boost::asio::streambuf sbuf;
    iostream os(&sbuf);
    string message("abcdefghijk lmnopqrs tuvwxyz");
    cout << "message=" << message << endl;
    os << message;
    std::istreambuf_iterator<char> sbit(&sbuf);
    std::istreambuf_iterator<char> end;
    std::string sbuf_it_wholestr(sbit, end);
    cout << "sbuf_it_wholestr=" << sbuf_it_wholestr << endl;    
}

prints:

message=abcdefghijk lmnopqrs tuvwxyz
sbuf_it_wholestr=abcdefghijk lmnopqrs tuvwxyz

If I want to get just part of a streambuf into a string, that seems really hard, because istreambuf_iterator isn't a random access iterator and doesn't support arithmetic:

// Get a whole string into a streambuf, and then get part of the streambuf back
//  into another string. We can't do this because istreambuf_iterator isn't a
//  random access iterator!
{
    boost::asio::streambuf sbuf;
    iostream os(&sbuf);
    string message("abcdefghijk lmnopqrs tuvwxyz");
    cout << "message=" << message << endl;
    os << message;
    std::istreambuf_iterator<char> sbit(&sbuf);
    // This doesn't work
    //std::istreambuf_iterator<char> end = sbit + 7; // Not random access!
    //std::string sbuf_it_partstr(sbit, end);
    //cout << "sbuf_it_partstr=" << sbuf_it_partstr << endl;    
}    

And there doesn't seem to be any way of directly using string::iterators to dump part of a string into a streambuf:

// istreambuf_iterator doesn't work in std::copy either
{
    boost::asio::streambuf sbuf;
    iostream os(&sbuf);
    string message("abcdefghijk lmnopqrs tuvwxyz");
    cout << "message=" << message << endl;
    std::istreambuf_iterator<char> sbit(&sbuf);
    //std::copy(message.begin(), message.begin()+7, sbit); // Doesn't work here
}    

I can always pull partial strings out of a streambuf if I don't mind formatted io, but I do - formatted io is almost never what I want:

// Get a whole string into a streambuf, and then pull it out through the
// stream using formatted input
{
    boost::asio::streambuf sbuf;
    iostream os(&sbuf);
    string message("abcdefghijk lmnopqrs tuvwxyz");
    cout << "message=" << message << endl;
    string part1, part2;
    os << message;
    os >> part1;
    os >> part2;
    cout << "part1=" << part1 << endl;    
    cout << "part2=" << part2 << endl;    
}

prints:

message=abcdefghijk lmnopqrs tuvwxyz
part1=abcdefghijk
part2=lmnopqrs

If I'm ok with an ugly copy, I can generate a substring, of course - std::string::iterator is random access...

// Get a partial string into a streambuf, and then pull it out using an
//  istreambuf_iterator
{
    boost::asio::streambuf sbuf;
    iostream os(&sbuf);
    string message("abcdefghijk lmnopqrs tuvwxyz");
    cout << "message=" << message << endl;
    string part_message(message.begin(), message.begin()+7);
    os << part_message;
    cout << "part_message=" << part_message << endl;
    std::istreambuf_iterator<char> sbit(&sbuf);
    std::istreambuf_iterator<char> end;
    std::string sbuf_it_wholestr(sbit, end);
    cout << "sbuf_it_wholestr=" << sbuf_it_wholestr << endl;    
}

prints:

message=abcdefghijk lmnopqrs tuvwxyz
part_message=abcdefg
sbuf_it_wholestr=abcdefg

The stdlib also has the curiously stand-alone std::getline(), which lets you pull individual lines out of an istream:

// If getting lines at a time was what I wanted, that can be accomplished too...          
{    
    boost::asio::streambuf sbuf;
    iostream os(&sbuf);
    string message("abcdefghijk lmnopqrs tuvwxyz\n1234 5678\n");
    cout << "message=" << message << endl;
    os << message;
    string line1, line2;
    std::getline(os, line1);
    std::getline(os, line2);
    cout << "line1=" << line1 << endl;
    cout << "line2=" << line2 << endl;
}

prints:

message=abcdefghijk lmnopqrs tuvwxyz
1234 5678

line1=abcdefghijk lmnopqrs tuvwxyz
line2=1234 5678

I feel like there's some Rosetta Stone that I've missed and that dealing with std::string and asio::streambuf would be so much easier if I discovered it. Should I just abandon the std::streambuf interface and make use of asio::mutable_buffer, which I can get out of asio::streambuf::prepare()?

There are 2 answers

12
sehe On BEST ANSWER
  1. istream::operator>>() mangles the contents of my streambufs (as you would expect given that it's 'formatted').

    Open your input stream with std::ios::binary flag and manipulate it with is >> std::noskipws

  2. For example, if I want to get a substring of a string into a streambuf, how do I do that without creating a copy of the string? A basic all-in-all-out transfer can be accomplished like

    Try like

     outstream.write(s.data() + start, length);
    

    Or use boost::string_ref:

     outstream << boost::string_ref(s).substr(start, length);
    

  3. And there doesn't seem to be any way of directly using string::iterators to dump part of a string into a streambuf:

     std::copy(it1, it2, ostreambuf_iterator<char>(os));
    
  4. Re. parsing the message lines:

    You can split into iterator ranges with iter_split.

    You can parse an embedded grammar on the fly with boost::spirit::istream_iterator
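
To make points 2 and 3 concrete, here is a minimal sketch of both approaches. It uses std::stringbuf as a stand-in so it compiles without Boost; since boost::asio::streambuf derives from std::streambuf, the same calls should apply to it. The helper names write_part and copy_part are made up for illustration and are not part of any library.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <ostream>
#include <sstream>
#include <streambuf>
#include <string>

// Hypothetical helper: send s[start, start + length) through an ostream
// using unformatted output. No temporary std::string is created.
inline void write_part(std::ostream& os, const std::string& s,
                       std::size_t start, std::size_t length) {
    assert(start <= s.size() && length <= s.size() - start);
    os.write(s.data() + start, static_cast<std::streamsize>(length));
}

// Hypothetical helper: the iterator-based equivalent. An ostreambuf_iterator
// (not an istreambuf_iterator) is the output iterator for a streambuf.
inline void copy_part(std::streambuf& buf, const std::string& s,
                      std::size_t start, std::size_t length) {
    assert(start <= s.size() && length <= s.size() - start);
    const auto first = s.begin() + static_cast<std::string::difference_type>(start);
    const auto last = first + static_cast<std::string::difference_type>(length);
    std::copy(first, last, std::ostreambuf_iterator<char>(&buf));
}
```

With a std::stringbuf sbuf, a std::ostream os(&sbuf), and the message from the question, write_part(os, message, 0, 7) followed by copy_part(sbuf, message, 12, 8) leaves "abcdefg" and then "lmnopqrs" in the buffer. The only copy made is the one into the buffer itself.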

0
Arthur Tacca On

Unless I'm missing something, you just need std::streambuf::sputn()

my_buf.sputn(mystr.data(), mystr.size());
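
For the substring case in the question, the same call can take an offset and a length. Below is a minimal sketch, again using std::stringbuf as a stand-in for boost::asio::streambuf (both derive from std::streambuf). It also shows sgetn(), the reading counterpart, for pulling only the first n characters back out. The helper names put_part and take are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <streambuf>
#include <string>

// Hypothetical helper: append s[start, start + length) to buf without
// building a temporary substring. Returns the number of characters stored.
inline std::streamsize put_part(std::streambuf& buf, const std::string& s,
                                std::size_t start, std::size_t length) {
    assert(start <= s.size() && length <= s.size() - start);
    return buf.sputn(s.data() + start, static_cast<std::streamsize>(length));
}

// Hypothetical helper: consume up to n characters from buf into a string.
// The result is shorter than n if the buffer runs out first.
inline std::string take(std::streambuf& buf, std::size_t n) {
    std::string out(n, '\0');
    const std::streamsize got =
        buf.sgetn(&out[0], static_cast<std::streamsize>(n));
    out.resize(static_cast<std::size_t>(got));
    return out;
}
```

Unlike operator>>, sgetn() does no whitespace skipping or tokenizing, so the bytes come back exactly as they went in.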