sscanf(s, "%u", &v) matching signed integers


After Cppcheck complained that "%u" was the wrong format specifier for scanning into an int variable, I changed the format to "%d". Taking a second look before committing the change, I wondered whether the original intention had been to reject negative inputs. I wrote two small programs to see the difference:

Specifier %d

#include <iostream>
#include <cstdio>  // declares sscanf
using namespace std;

int main() {
    const char* s = "-4";
    int value = -1;
    int res = sscanf(s, "%d", &value);
    cout << "value:" << value << endl;
    cout << "res:" << res << endl;
    return 0;
}

see also https://ideone.com/OR3IKN

Specifier %u

#include <iostream>
#include <cstdio>  // declares sscanf
using namespace std;

int main() {
    const char* s = "-4";
    int value = -1;
    // %u paired with an int*: the mismatch Cppcheck reported
    int res = sscanf(s, "%u", &value);
    cout << "value:" << value << endl;
    cout << "res:" << res << endl;
    return 0;
}

see also https://ideone.com/WPWdqi

Result(s)

Surprisingly, both conversion specifiers accept the sign:

value:-4
res:1

I had a look into the documentation on cppreference.com. For C (scanf, fscanf, sscanf, scanf_s, fscanf_s, sscanf_s - cppreference.com) as well as C++ (std::scanf, std::fscanf, std::sscanf - cppreference.com) the description for the "%u" conversion specifier is the same (emphasis mine):

matches an unsigned decimal integer.
The format of the number is the same as expected by strtoul() with the value 10 for the base argument.

Is the observed behaviour standard-compliant? Where can I find this documented?

[Update] Undefined Behaviour, really, why?

I read that it was simply UB. To add to the confusion, here is a version that declares value as unsigned (https://ideone.com/nNBkqN). I think the assignment of -1 still behaves as expected, but "%u" obviously still matches the sign:

#include <iostream>
#include <cstdio>  // declares sscanf

using namespace std;

int main() {
    const char* s = "-4";
    unsigned value = -1;
    cout << "value before:" << value << endl;
    int res = sscanf(s, "%u", &value);
    cout << "value after:" << value << endl;
    cout << "res:" << res << endl;
    return 0;
}

Result:

value before:4294967295
value after:4294967292
res:1

There are 2 answers

T.C. On BEST ANSWER

There are two separate issues.

  1. %u expects an unsigned int* argument; passing an int* is UB.
  2. Does %u match -4? Yes. The expected format is that of strtoul with base 10, and if you read the documentation it's quite clear that a leading minus sign is allowed.
Bathsheba On

No, it's not standard-compliant. In fact the behaviour of your program is undefined: the format specifier for sscanf must match the types of the arguments.