In C programming, whenever I use fgetc(file) to read all the chars until the end of the file, it works. But when I use the similar fscanf(file, "%c") function, it prints strange characters.
Code:
#include <stdio.h>
#include <stdlib.h>
int main() {
    char c;
    FILE * file = fopen("D\\filename.txt", "r");
    while (c != EOF) {
        fscanf(file, "%c", &c);
        printf("%c", c);
    }
    return 0;
}
But when I use fgetc instead of fscanf, it works, and it prints each character that is present in the file.
Can anybody answer why it works like this?
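Notice that reading the uninitialized variable c in the first test of while (c != EOF) is undefined behavior (something you should be afraid of, even when a program apparently seems to "work"), and every good C compiler (e.g. GCC, invoked as gcc -Wall -Wextra -g) should warn you about that (if you enable all warnings). When coding in C you should also learn how to use the debugger (e.g. gdb).
You should read the documentation of fscanf(3). fscanf never stores EOF into c; it reports the end of the file through its return value. You probably want to code a loop that tests the value returned by fscanf (the number of items successfully scanned, or EOF) instead of comparing c to EOF.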
You had better get into the habit of initializing every variable; a good optimizing compiler will remove that initialization if it is useless, and will often warn you about uninitialized variables otherwise.
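Notice that using fgetc(3) in your case is probably preferable. Then you need to declare c as an integer (int), not a character, because fgetc returns an int so that EOF can be distinguished from every valid character, and code a loop that stops as soon as fgetc gives EOF. Notice that in such a loop feof(file) would never be true when tested as the loop condition (because fgetc would have given EOF before), so you had better use while(true) (with <stdbool.h>) rather than while(!feof(file)).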
Using fgetc is simpler to read (by other developers working on the same code, or even by yourself in a couple of months), and it is very probably faster. Most implementations of fscanf are based somehow on fgetc or on something closely related.
Also, get into the good habit of testing your input. The input file might not be what you expect.
On most recent systems, the encoding today is UTF-8. Be aware that some (human language) characters may be encoded in several bytes (e.g. the French accented letter é, the Russian yery letter Ы, the Euro sign €, the mathematical "for all" sign ∀, letters or glyphs in other languages, etc.). You should probably consider using some UTF-8 library (e.g. libunistring) if you care about that (and you should care about UTF-8 in serious software!).
Nota Bene: If you are young and learning programming, better (IMNSHO) learn Scheme with SICP, using e.g. Racket, before learning C or Java. C is really not for beginners IMHO.
PS: the character type (often a byte) is char, in lower case.