I'm seeing some strange behavior when running the same code on native Ubuntu and on Ubuntu on Win 10 (WSL). Both are Ubuntu 18.04 with GCC, and I'm compiling with these flags: -std=c++11 -g.

My problem is that I have this regular expression: ^[\\t ]*(?:([.A-Za-z0-9_]+[:]))?(?:[\\t ]*([A-Za-z]{2,4})(?:[\\t ]+(@[A-Za-z0-9_]+(?:(?:\\+|-)[0-9]+)?|\".+?\"|\'.+?\'|[.A-Za-z0-9_]+)(?:[\\t ]*[,][\\t ]*(@[A-Za-z0-9_]+(?:(?:\\+|-)[0-9]+)?|\".+?\"|\'.+?\'|[.A-Za-z0-9_]+))?(?:[\\t ]*[,][\\t ]*(@[A-Za-z0-9_]+(?:(?:\\+|-)[0-9]+)?|\".+?\"|\'.+?\'|[.A-Za-z0-9_]+))?)?)?

for matching some assembler-like string (yeah, I know it's long).

I'm using it like this:

std::ifstream infile(this->inFile.c_str());
while (std::getline(infile, line)) {
    lineNumber++;
    line = line.substr(0, line.find("#"));
    if (line == "")
        continue;
    line = reduce(line);

    if (regex_match(line, m, op_reg)) {
        std::vector<std::string> ops;
        ops.push_back(std::to_string(lineNumber));
        for (std::size_t i = 1; i < m.size(); i++) {
            if (m[i] != "") {
                ops.push_back(m[i]);
            }
        }
        this->maps.opMap.push_back(ops);
    }
}

This piece of code is working well on my native Ubuntu installation, but on the WSL version of Ubuntu regex_match always returns false.

Has anyone run into this already? If so, how did you deal with it?

1 Answer

Jeremy Talus On

As wp78de figured out, it was the Windows-style line ending \r\n that messed up the regex: std::getline only strips the \n, so the \r stays at the end of the string, and regex_match has to match the whole string.
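To see the effect in isolation, here is a minimal sketch. The names first_line and matches_op are made up for illustration, and the pattern is a much-simplified stand-in for the regex in the question (one mnemonic and two plain operands), not the real op_reg.

```cpp
#include <regex>
#include <sstream>
#include <string>

// Reads the first line of `text` the same way the loop in the question does.
std::string first_line(const std::string& text) {
    std::istringstream in(text);
    std::string line;
    std::getline(in, line);  // removes the '\n' only; a preceding '\r' is kept
    return line;
}

// Hypothetical, simplified stand-in for op_reg: mnemonic, operand, comma, operand.
bool matches_op(const std::string& line) {
    static const std::regex simple_op(
        "^[\\t ]*[A-Za-z]{2,4}[\\t ]+[A-Za-z0-9_]+[\\t ]*,[\\t ]*[A-Za-z0-9_]+");
    // regex_match succeeds only if the pattern covers the entire string,
    // so a single trailing '\r' is enough to make it fail.
    return std::regex_match(line, simple_op);
}
```

With this sketch, matches_op(first_line("mov r1, r2\n")) should succeed, while the same text ending in \r\n should fail, because the line handed to regex_match still ends in \r.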

So I wrote this little function for my string utils lib:

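// Strip any trailing '\r' and '\n' characters in place.
void chomp(std::string& str) {
    // If the string is empty or holds only line-ending characters, pos is
    // npos and pos + 1 wraps around to 0, so the whole string is erased.
    const auto pos = str.find_last_not_of("\r\n");
    str.erase(pos + 1);
}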

It removes the leftover \r from each line just before I test the line against the regex.
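For completeness, a rough sketch of where the call could go in a reading loop. read_clean_lines is a hypothetical helper, not part of my lib, and I'm assuming the cleanup runs right after std::getline. That way the empty-line check from the question also works on lines that contain nothing but \r.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Same helper as above: strips trailing '\r' and '\n' in place.
void chomp(std::string& str) {
    const auto pos = str.find_last_not_of("\r\n");
    str.erase(pos + 1);
}

// Hypothetical helper: reads all lines from `in`, removing the line ending
// (Unix or Windows style) before the line is used for anything else.
std::vector<std::string> read_clean_lines(std::istream& in) {
    std::vector<std::string> lines;
    std::string line;
    while (std::getline(in, line)) {
        chomp(line);  // must happen before comment stripping and regex_match
        lines.push_back(line);
    }
    return lines;
}
```

In the loop from the question, this would mean calling chomp(line) as the first statement inside the while body, before the substr and the empty-line check.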