Why doesn't the C11 standard drop the unsafe strcat() and strcpy() functions?


The C11 and C++14 standards dropped the gets() function, which is inherently insecure because it performs no bounds checking and so can overflow its buffer. Why, then, doesn't the C11 standard drop strcat() and strcpy() as well? strcat() doesn't check whether the second string will fit in the first array, and strcpy() has no provision for checking the bounds of the target array either. What if the source array has more characters than the destination array can hold? Most probably the program will crash at runtime.

So wouldn't it be better if these two unsafe functions were removed from the language completely? Why do they still exist? Wouldn't it be fine to have only functions like strncat() and strncpy()? If I am not wrong, the Microsoft C and C++ compiler provides safe versions of these functions, strcpy_s() and strcat_s(). Why aren't they officially implemented by other C compilers to provide safety?

5

There are 5 answers

2
Keith Thompson On BEST ANSWER

gets() is inherently unsafe, because in general it can overflow the target if too much data is received on stdin. This:

char s[MANY];
gets(s);

will cause undefined behavior if MANY or more characters are entered before the newline (the terminating '\0' needs a slot too), and there is typically nothing the program can do to prevent it.

strcpy() and strcat() can be used completely safely, since they can overflow the target only if the source string is too long to be contained in the target array. The source string is contained in an array object that is under the control of the program itself, not of any external input. For example, this:

char s[100];
strcpy(s, "hello");
strcat(s, ", ");
strcat(s, "world");

cannot possibly overflow unless the program itself is modified.
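When the source length is not known at compile time, the same reasoning still applies as long as the program checks the length itself before copying. A minimal sketch, assuming a hypothetical helper named copy_if_fits (not a standard function):

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies src into dst (capacity dst_size) only if
   the whole string, including its terminating '\0', fits.
   Returns 0 on success, -1 if src is too long (dst is left untouched). */
int copy_if_fits(char *dst, size_t dst_size, const char *src)
{
    if (strlen(src) >= dst_size)
        return -1;          /* would overflow: refuse instead of copying */
    strcpy(dst, src);       /* safe here: the length was checked above */
    return 0;
}
```

With a 6-byte target, copy_if_fits(s, sizeof s, "hello") succeeds, while copy_if_fits(s, sizeof s, "hello!") returns -1 and leaves s alone.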

strncat() can be used as a safer version of strcat() -- as long as you specify the third argument correctly. One problem with strncat() is that it only gives you one way of handling the case where there's not enough room in the target array: it silently truncates the string. Sometimes that might be what you want, but sometimes you might want to detect the overflow and do something about it.
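The third argument of strncat() is the maximum number of characters to append, not the total size of the target, which is where it is easy to go wrong. A rough sketch of one way to compute it and to detect truncation, assuming a hypothetical helper named append_checked and assuming dst already holds a valid string shorter than dst_size:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: appends src to the string already in dst
   (total capacity dst_size). Returns 0 if all of src fit,
   -1 if it had to be truncated. */
int append_checked(char *dst, size_t dst_size, const char *src)
{
    size_t used = strlen(dst);
    size_t room = dst_size - used - 1;   /* free space, excluding the '\0' */

    strncat(dst, src, room);             /* appends at most room chars plus '\0' */
    return strlen(src) > room ? -1 : 0;  /* report truncation to the caller */
}
```

The return value gives the caller the choice that plain strncat() does not: accept the truncated string, or treat the overflow as an error.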

As for strncpy(), it is not simply a safer version of strcpy(). It's not inherently dangerous, but if you're not very careful you can easily leave the target array without a terminating '\0' null character, leading to undefined behavior next time you pass it to a function expecting a pointer to a string. As it happens, I've written about this.
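To illustrate the missing terminator: strncpy(dst, "hello", 5) fills a 5-byte array with 'h','e','l','l','o' and writes no '\0'. One common workaround is to copy one character fewer and terminate by hand. A minimal sketch, assuming a hypothetical helper named copy_terminated:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies src into dst (capacity dst_size) and
   always leaves dst null-terminated.
   Returns 1 if src fit completely, 0 if it was truncated. */
int copy_terminated(char *dst, size_t dst_size, const char *src)
{
    if (dst_size == 0)
        return 0;                      /* nowhere to put even the '\0' */
    strncpy(dst, src, dst_size - 1);   /* may stop without writing '\0' */
    dst[dst_size - 1] = '\0';          /* so terminate explicitly */
    return strlen(src) < dst_size;
}
```

This still truncates silently unless the caller checks the return value, so it is a mitigation rather than a replacement for knowing the sizes involved.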

7
Gopi On

What you are describing are scenarios that lead to undefined behavior.

Let's say

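char a[3] = "abc";            /* fills the array exactly; no room for '\0' */
for (int i = 0; i < 5; i++)
    printf("%c\n", a[i]);     /* a[3] and a[4] are out of bounds: undefined behavior */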

This is an out-of-bounds array access, and the standard hasn't removed the possibility because you are the one assigning the values; it is under your control.

The same goes for strcpy() and strcat().

So the standard can't remove every scenario that leads to UB.

gets(), on the other hand, is not under the programmer's control: it takes data from a stream, you never know what the input might be, and there is a high probability of ending up with a buffer overflow. So it has been removed, and the safer fgets() should be used instead.
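fgets() is safer because the caller passes the buffer size, so the library knows where to stop. A minimal sketch of a line reader built on it, assuming a hypothetical helper named read_line and a policy of discarding the rest of an over-long line:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads one line from in into buf (capacity size),
   without the trailing newline.
   Returns 1 on success, 0 on end of input or error,
   -1 if the line was too long (the excess is discarded). */
int read_line(char *buf, size_t size, FILE *in)
{
    if (size == 0 || fgets(buf, (int)size, in) == NULL)
        return 0;

    size_t len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';          /* complete line: strip the newline */
        return 1;
    }

    int c = getc(in);
    if (c == EOF || c == '\n')
        return 1;                     /* line fit exactly, or input ended */

    while ((c = getc(in)) != EOF && c != '\n')
        ;                             /* discard the rest of the long line */
    return -1;
}
```

Unlike gets(), no input can make this write past buf; the worst case is a truncated line that the caller is told about.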

1
Yu Hao On

strcpy and strcat aren't similar to gets. The problem with gets is that it reads from input, so whether a buffer overflow occurs is out of the programmer's control.


The C99 Rationale explains strncpy as follows:

Rationale for International Standard — Programming Languages — C §7.21.2.4 The strncpy function

strncpy was initially introduced into the C library to deal with fixed-length name fields in structures such as directory entries. Such fields are not used in the same way as strings: the trailing null is unnecessary for a maximum-length field, and setting trailing bytes for shorter names to null assures efficient field-wise comparisons. strncpy is not by origin a “bounded strcpy,” and the Committee preferred to recognize existing practice rather than alter the function to better suit it to such use.
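The use described in the Rationale can be sketched as follows. The struct layout, the 14-byte field width, and the helper names are assumptions chosen for illustration, not taken from any particular system header:

```c
#include <string.h>

#define NAME_LEN 14   /* assumed field width, for illustration only */

/* Illustrative fixed-width record: name is a field, not a C string,
   so it carries no '\0' when the name is exactly NAME_LEN long. */
struct dir_entry {
    unsigned short inode;
    char name[NAME_LEN];
};

/* strncpy pads shorter names with '\0' up to the full field width. */
void set_name(struct dir_entry *e, const char *name)
{
    strncpy(e->name, name, NAME_LEN);
}

/* Because every unused byte is '\0', whole fields compare with memcmp. */
int same_name(const struct dir_entry *a, const struct dir_entry *b)
{
    return memcmp(a->name, b->name, NAME_LEN) == 0;
}
```

Here the lack of a terminator on a maximum-length name is intended, which is exactly why the same behavior is a trap when the target is meant to be an ordinary string.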

0
P.P On

When removing a function completely, one of the main things a standards committee has to consider is how much code the removal could break and how many people (programmers, library writers, compiler vendors, etc.) would be annoyed by, or would oppose, the change.

gets() was deprecated in the LSB (Linux Standard Base), and POSIX-2008 marked it obsolescent. gets() has long been known as a seriously bad function, and its use in any code has always been strongly discouraged. Pretty much every C programmer knew it was seriously dangerous to use gets(). So the chance of its removal breaking any production code was very small, if not nonexistent, which made it easy for the committee to remove gets() from C11.

But that's not the case with strcpy, strcat, etc. They can be used safely, and they are still used by many programmers in new code. While they can be subject to buffer overflow, that is mostly under the programmer's control, whereas with gets() it isn't.

An argument can be made for using snprintf in place of strcpy and strcat. But that would seem pointless in simple cases like:

char buf[256];
strcpy(buf, "hello");

(if buf were a pointer, the allocated size would need to be tracked for use in snprintf)

because, as a programmer, I know the above is perfectly safe. More importantly, a lot of legacy code would break. Basically, no strong argument can be made for removing strcpy and similar functions, since they can be used safely.
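For cases where the sizes are not obvious at a glance, the snprintf alternative might look like the sketch below. The helper name build_greeting and its output format are made up for illustration; the point is that snprintf's return value tells the caller whether the result was truncated:

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: writes "<greeting>, <name>" into buf (capacity size).
   Returns 0 if the whole result fit, -1 on truncation or encoding error. */
int build_greeting(char *buf, size_t size, const char *greeting, const char *name)
{
    /* snprintf never writes more than size bytes and returns the length
       the full result would have had, not counting the '\0'. */
    int n = snprintf(buf, size, "%s, %s", greeting, name);
    return (n >= 0 && (size_t)n < size) ? 0 : -1;
}
```

With a 16-byte buffer, build_greeting(buf, sizeof buf, "hello", "world") stores "hello, world" and returns 0; with an 8-byte buffer it stores "hello, " and returns -1.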

3
Lundin On
  • Myth 1: strcpy() is unsafe and how it works comes as a great surprise to a veteran C programmer.
  • Myth 2: strncpy() is safe.
  • Myth 3: strncpy() is a safer version of strcpy().
  • Myth 4: Microsoft is some kind of authority on the use of the C language and knows what it is talking about.

strcat() and strcpy() are perfectly safe functions.

Also note that strncpy was never intended to be a safe version of strcpy. It was designed for an obscure, obsolete string format used in an ancient version of Unix. strncpy is actually very unsafe (one of many blog posts about it here), unlike strcpy, since very few programmers seem to be able to use it without producing fatal bugs (missing null termination).

A better question is why the inherently unsafe strncpy() wasn't removed from the language. Is anyone working with obscure Unix strings from the 1970s much?