C++ - How to recover an istream if a user-defined extractor fails

I need a user-defined extractor (operator>>) to read a specific string into my own data type.

The problem is that the requirements for the string are extensive.

Hence the easiest way is probably to read the whole string from the istream and then check if all requirements are fulfilled.

My problem is what to do when the string is not valid. As far as I know, the convention in C++ is that a failed extraction leaves the stream unchanged.

What is the best practice for recovering the istream in this case? Is the error handling in the following example sufficient?

std::istream& operator>>(std::istream& is, Foo& f)
{
    std::string str;

    if (is >> str)
    {
        // check if string is valid
        if ( is_valid( str ) )
        {
            // set new values in f
        }
        else
        {
            // recover stream
            std::for_each(str.rbegin(), str.rend(),
                          [&] (char c)
            {
                is.putback(c);
            });

            // set failbit (setstate adds the flag without discarding other error flags)
            is.setstate(std::ios_base::failbit);
        }
    }

    return is;
}

And what about using std::getline() instead of is >> str? Are there other pitfalls?

Thanks

Marco

Answer by Dietmar Kühl (best answer)

You can't get a stream back to the position where you started reading, at least not in general. In theory, you can put back characters or seek to an earlier position, but many stream buffers support neither. The standard library gives some limited guidance, but it only deals with rather simple types such as integers: characters are read as long as they match the format, and reading stops right there. Even when the format matches, some errors are reported later than they could have been detected.
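To illustrate the seeking option and its limits, here is a minimal sketch. The helper name read_token_or_rewind and its predicate parameter are hypothetical and not part of the standard library. It remembers the position with tellg(), reads one token, and tries to seek back if the token is rejected.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: reads one whitespace-delimited token and, if
// `accept` rejects it, tries to seek back to where reading started.
// Returns true only when a token was read and accepted.
template <typename Pred>
bool read_token_or_rewind(std::istream& is, std::string& out, Pred accept)
{
    // tellg() returns pos_type(-1) on failure, e.g. if the buffer cannot report a position.
    const std::istream::pos_type start = is.tellg();

    std::string token;
    if (!(is >> token)) {
        return false;              // nothing could be read; failbit is already set
    }
    if (accept(token)) {
        out = token;
        return true;
    }

    is.clear();                    // drop eofbit in case the token ended at end of input
    if (start != std::istream::pos_type(-1)) {
        is.seekg(start);           // sets failbit itself if the buffer cannot seek
    }
    is.setstate(std::ios_base::failbit);   // report the rejected token
    return false;
}
```

With a std::istringstream this makes the rejected token readable again after the caller clears the error flags. For a stream whose buffer cannot seek (input from a pipe typically cannot), the seekg() call fails and the token stays consumed, so this is not something an extractor can rely on in general.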

Here is a test program demonstrating the standard library behavior:

#include <iostream>
#include <sstream>
#include <string>

void test(std::string const& input)
{
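    std::istringstream in(input);
    // Clear basefield so a leading 0 selects octal (needed for the third test);
    // with the default decimal base the leading 0 would simply be ignored.
    in.unsetf(std::ios_base::basefield);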
    int i;
    std::string tail;

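    bool result(in >> i);       // true if the extraction succeeded
    in.clear();                 // reset the error flags so the rest can be read
    std::getline(in, tail);     // whatever the extraction left in the stream
    std::cout << "input='" << input << "' "
              << "fail=" << std::boolalpha << !result << " "
              << "tail='" << tail << "'\n";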
}

int main()
{
    test("10 y");
    test("-x y");
    test("0123456789 x");
    test("123456789012345678901234567890 x");
}

Just to explain the four test cases:

  1. Just to make sure the test does what it is meant to do, the first input is actually OK and there is no problem.
  2. The second input starts with a character matching the format followed by something not matching and reading stops right after the '-' character.
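  3. The third test is meant to read an int in octal notation, which requires the stream's basefield to be cleared (e.g., with in.unsetf(std::ios_base::basefield)); with the default decimal base the leading 0 is simply ignored and the extraction succeeds. In the octal case, the failure could have been detected upon the character '8', but both the '8' and the '9' are consumed and the input fails; the exact behavior may vary between implementations.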
  4. The last example results in an overflow which could be detected before all digits are read but still all digits are read.

Based on that, I'd think there wouldn't be an expectation to reset the stream to the original position when semantic checks on well-formed input fail.
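Following that reasoning, an extractor can simply report the failure and leave the consumed characters alone. The following is a minimal sketch under stated assumptions: the member Foo::value and the letters-only rule in is_valid() are made up for illustration, since the question shows neither the real type nor the real requirements.

```cpp
#include <cctype>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical type, used only for illustration.
struct Foo {
    std::string value;
};

// Assumed validity rule: non-empty and alphabetic characters only.
bool is_valid(const std::string& s)
{
    if (s.empty()) return false;
    for (unsigned char c : s) {
        if (!std::isalpha(c)) return false;
    }
    return true;
}

std::istream& operator>>(std::istream& is, Foo& f)
{
    std::string str;
    if (is >> str) {
        if (is_valid(str)) {
            f.value = str;   // commit only after all checks have passed
        } else {
            // The token stays consumed; only the failure is reported.
            is.setstate(std::ios_base::failbit);
        }
    }
    return is;
}
```

On invalid input the target object keeps its previous value and the stream is left positioned right after the rejected token, which mirrors what the built-in extractors do in the test cases above.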