Another pearl from the early IOCCC years is Larry Wall's 1986 entry:
http://www.ioccc.org/years.html#1986 (wall)
I suspect there is no C compiler today that can compile that source straight out of the box, due to the severe preprocessor abuse it contains:
- The latest TDM-GCC 9.2.0, set to ANSI mode, fails
- The last TCC release, 0.9.27, fails
However, after wrenching the preprocessed code out of the obfuscated original (always with GCC's cpp -traditional), both TCC and GCC manage to compile it; nevertheless, GCC's effort goes to waste, as the program chokes when it begins decoding its obfuscated intro text (not going to spoil that here for those who want to delve in themselves!). TCC, on the other hand, merely warns about the implicit declarations of system(), read() and write() and quickly produces a working program.
I tried to step through the GCC code execution with GDB, and that's how I found that the GCC-compiled code chokes on the second pass of a for loop that goes through a text string in order to decode it:
[Inferior 1 (process 9460) exited with code 030000000005]
The process ID is irrelevant; it just identifies the debug-build executable that crashed. The exit code, however, stays the same across runs.
Clearly, TCC is better suited to such IOCCC entries than GCC. The latter still manages to successfully compile, and even run, some entries, but for tough cases like this one TCC is hard to beat. Its only drawback is that it falls short when preprocessing extremely abusive code such as this example: it leaves spaces between certain preprocessed tokens and thus fails, in places, to concatenate them into the author's intended C keywords, whereas GCC's cpp works 100%.
My question is, as philosophical or even rhetorical as it sounds:
What is it in modern GCC that makes it either fail to compile, or produce unusable code when it does compile, earlier C programs, unlike TCC?
Thanks in advance for all the feedback, I appreciate it!
NOTE: I am using Windows 10 version 2004 with WSL 2; GCC fails in both the Windows and the WSL 2 environments. I am planning to compile TCC in WSL 2 for comparisons in that environment too.
PS: I immensely enjoyed this program when it finally executed as intended. It undoubtedly deserves that year's "grand prize in most well-rounded in confusion"!
Undefined behaviour, which back then was more the rule than the exception. Just look at this classic 1984 entry.
The C compilers of today compile C as set forth in the ISO 9899 standard, whose first edition was published in 1990 (the ANSI version in 1989). The program predates that. Notably, it uses some really odd traditional preprocessor syntax that is invalid in C89, C99, C11 and so forth.
The general idea is that you do not want to allow this syntax by default, because a traditional preprocessor would not produce code compatible with a modern one; for example, a traditional preprocessor would replace macro names within string literals too, whereas an ISO preprocessor leaves string literals untouched. A program relying on that can be valid C89, although bad style, yet preprocess to something entirely different under the traditional rules. So it is best to error out at any sign of non-standard preprocessor usage; otherwise the code could be subtly broken, because such substitutions would not have been made.
Another thing is writable strings. The deobfuscation code directly attempts to modify string literals; C89 specified that this has undefined behaviour, and here it causes a crash because the literals are mapped into read-only pages in a GCC-compiled program. Older GCC versions supported -fwritable-strings, but it was deprecated a long time ago because it was buggy anyway. I got the program running with minimal changes using GCC 9.3.0.
-traditional is no longer supported for actual compilation, so you must preprocess first and compile after that. Then I wrapped everything that looks like a string literal "..." that is not within a line directive (a line starting with #) into a (char[]){"..."} array compound literal: compound literals designate unnamed objects with static or automatic storage duration, and non-const-qualified ones are writable.