Why does the 'head' command correctly print a line starting with '\0' while fgets() does not?


Here is the file I want to read; it contains some unexpected '\0' bytes:

   1 
   2 Mon Aug 28 14:29:29 2023 admin[::ffff:192.168.7.116]:59927: brief-show slot 1 ont-info 
   3 Mon Aug 28 14:29:33 2023 admin[::ffff:192.168.7.116]:59927: brief-show slot 1 ont-unbound 
   4 Mon Aug 28 14:29:41 2023 admin[::ffff:192.168.7.116]:59927: brief-show slot 2 ont-info 
   5 Mon Aug 28 14:29:43 2023 admin[::ffff:192.168.7.116]:59927: brief-show slot 2 ont-info 
   6 Mon Aug 28 14:29:44 2023 admin[::ffff:192.168.7.116]:59927: brief-show slot 2 ont-info 
   7 Mon Aug 28 14:29:47 2023 admin[::ffff:192.168.7.116]:59927: brief-show slot 2 ont-info 
   8 Mon Aug 28 14:29:51 2023 admin[::ffff:192.168.7.116]:59927: brief-show slot 2 ont-info 
   9 Mon Aug 28 14:29:53 2023 admin[::ffff:192.168.7.116]:59927: brief-show slot 2 ont-info 
  10 Mon Aug 28 14:31:18 2023 admin[::ffff:192.168.7.116]:59927: brief-show slot 2 ont-info 
  11 Mon Aug 28 14:33:50 2023 admin[::ffff:192.168.7.116]:59927: exi
  12 Mon Aug 28 14:33:51 2023 admin[::ffff:192.168.7.116]:59927: exi
  13 Mon Aug 28 14:33:54 2023 admin[::ffff:192.168.7.116]:59927: save config 
           14^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@Mon Aug 28 14:37:00 2023 admin[::ffff:192.168.7.116]:60207: show slot 
  15 Mon Aug 28 14:37:01 2023 admin[::ffff:192.168.7.116]:60207: show slot
  16 Mon Aug 28 14:37:03 2023 admin[::ffff:192.168.7.116]:60207: show slot
  17 Mon Aug 28 14:37:07 2023 admin[::ffff:192.168.7.116]:60207: show slot
  18 Mon Aug 28 14:37:21 2023 admin[::ffff:192.168.7.116]:60207: show slot
  19 Mon Aug 28 14:37:22 2023 admin[::ffff:192.168.7.116]:60207: show slot
  20 Mon Aug 28 14:37:27 2023 admin[::ffff:192.168.7.116]:60207: show slot

Here is my code, with two ways to print the file to the shell: one using fgets(), and the other using the Linux 'head' command:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

const int n = 20;

#if 0
int main(void)
{
    FILE *fp = NULL;
    char line[1024];
    int i = 0;

    fp = fopen("/home/zyh/test/cli_log", "r");
    if (fp == NULL)
    {
        perror("failed to open cli_log\r\n");
        return -1;
    }

    while(fgets(line, sizeof(line), fp) != NULL)
    {
        printf("%s", line);
        i++;
        if (i >= n)
            break;
    }
    fclose(fp);

    return 0;
}
#endif

int main(void)
{
    char cmd[50] = "head -n 20 /home/zyh/test/cli_log";

    system(cmd);

    return 0;
}

And here are the results of the two approaches.

With fgets():


Mon Aug 28 14:29:29 2023 admin[::ffff:192.168.7.116]:59927: brief-show slot 1 ont-info 
Mon Aug 28 14:29:33 2023 admin[::ffff:192.168.7.116]:59927: brief-show slot 1 ont-unbound 
Mon Aug 28 14:29:41 2023 admin[::ffff:192.168.7.116]:59927: brief-show slot 2 ont-info 
Mon Aug 28 14:29:43 2023 admin[::ffff:192.168.7.116]:59927: brief-show slot 2 ont-info 
Mon Aug 28 14:29:44 2023 admin[::ffff:192.168.7.116]:59927: brief-show slot 2 ont-info 
Mon Aug 28 14:29:47 2023 admin[::ffff:192.168.7.116]:59927: brief-show slot 2 ont-info 
Mon Aug 28 14:29:51 2023 admin[::ffff:192.168.7.116]:59927: brief-show slot 2 ont-info 
Mon Aug 28 14:29:53 2023 admin[::ffff:192.168.7.116]:59927: brief-show slot 2 ont-info 
Mon Aug 28 14:31:18 2023 admin[::ffff:192.168.7.116]:59927: brief-show slot 2 ont-info 
Mon Aug 28 14:33:50 2023 admin[::ffff:192.168.7.116]:59927: exi
Mon Aug 28 14:33:51 2023 admin[::ffff:192.168.7.116]:59927: exi
Mon Aug 28 14:33:54 2023 admin[::ffff:192.168.7.116]:59927: save config 
Mon Aug 28 14:37:01 2023 admin[::ffff:192.168.7.116]:60207: show slot 
Mon Aug 28 14:37:03 2023 admin[::ffff:192.168.7.116]:60207: show slot 
Mon Aug 28 14:37:07 2023 admin[::ffff:192.168.7.116]:60207: show slot 
Mon Aug 28 14:37:21 2023 admin[::ffff:192.168.7.116]:60207: show slot 
Mon Aug 28 14:37:22 2023 admin[::ffff:192.168.7.116]:60207: show slot 
Mon Aug 28 14:37:27 2023 admin[::ffff:192.168.7.116]:60207: show slot 

With the head command:


Mon Aug 28 14:29:29 2023 admin[::ffff:192.168.7.116]:59927: brief-show slot 1 ont-info 
Mon Aug 28 14:29:33 2023 admin[::ffff:192.168.7.116]:59927: brief-show slot 1 ont-unbound 
Mon Aug 28 14:29:41 2023 admin[::ffff:192.168.7.116]:59927: brief-show slot 2 ont-info 
Mon Aug 28 14:29:43 2023 admin[::ffff:192.168.7.116]:59927: brief-show slot 2 ont-info 
Mon Aug 28 14:29:44 2023 admin[::ffff:192.168.7.116]:59927: brief-show slot 2 ont-info 
Mon Aug 28 14:29:47 2023 admin[::ffff:192.168.7.116]:59927: brief-show slot 2 ont-info 
Mon Aug 28 14:29:51 2023 admin[::ffff:192.168.7.116]:59927: brief-show slot 2 ont-info 
Mon Aug 28 14:29:53 2023 admin[::ffff:192.168.7.116]:59927: brief-show slot 2 ont-info 
Mon Aug 28 14:31:18 2023 admin[::ffff:192.168.7.116]:59927: brief-show slot 2 ont-info 
Mon Aug 28 14:33:50 2023 admin[::ffff:192.168.7.116]:59927: exi
Mon Aug 28 14:33:51 2023 admin[::ffff:192.168.7.116]:59927: exi
Mon Aug 28 14:33:54 2023 admin[::ffff:192.168.7.116]:59927: save config 
Mon Aug 28 14:37:00 2023 admin[::ffff:192.168.7.116]:60207: show slot 
Mon Aug 28 14:37:01 2023 admin[::ffff:192.168.7.116]:60207: show slot 
Mon Aug 28 14:37:03 2023 admin[::ffff:192.168.7.116]:60207: show slot 
Mon Aug 28 14:37:07 2023 admin[::ffff:192.168.7.116]:60207: show slot 
Mon Aug 28 14:37:21 2023 admin[::ffff:192.168.7.116]:60207: show slot 
Mon Aug 28 14:37:22 2023 admin[::ffff:192.168.7.116]:60207: show slot 
Mon Aug 28 14:37:27 2023 admin[::ffff:192.168.7.116]:60207: show slot 

You can see that the line with the time "14:37:00" is printed by the head command but not by the fgets() version. So I'm confused: how does head print a string starting with '\0'?

Does head strip all the '\0' bytes? I tried redirecting the output of head into a file (head -n 20 cli_log > head_log), but when I opened head_log it was the same as cli_log.


There are 3 answers

0
0___________ On

fgets is a string-oriented file function, and \0 has a very special meaning in C strings: it marks the end of the string. That is why it does not work as you expect.

Do not use string functions if your file is not properly formatted text.

You need to treat this file as a binary file and read it accordingly.
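As a rough sketch of what "read it as binary" could look like for this case, the helper below copies bytes one at a time and counts newlines, so a null byte is handled like any other byte. The function name copy_first_lines is made up for illustration, and this is not claimed to be how head itself is implemented.

```c
#include <stdio.h>

/* Hypothetical helper: copy bytes from in to out until max_lines newline
   characters have been copied or the input ends. Null bytes are copied
   like any other byte. Returns the number of complete lines copied. */
int copy_first_lines(FILE *in, FILE *out, int max_lines)
{
    int lines = 0;
    int ch;

    while (lines < max_lines && (ch = getc(in)) != EOF) {
        putc(ch, out);
        if (ch == '\n') {
            lines++;
        }
    }
    return lines;
}
```

With the question's file, this could be called as copy_first_lines(fp, stdout, 20) after opening the file with fopen(path, "rb").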

0
Wyatt Carpenter On

Your fgets-using program prints using the line printf("%s", line);. Since \0 is the string termination character in C strings, a line that starts with \0 looks like an empty string to printf, and therefore it prints nothing. The unix head command, however, can deal with all bytes that might occur in a file, including a byte with a value of 0.
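To make the difference concrete, here is a small sketch with two illustrative helpers (both names are invented for this example). The first reports how many bytes a "%s" conversion would emit, which is the same count strlen() gives; the second writes an explicit number of bytes with fwrite(), null bytes included.

```c
#include <stdio.h>
#include <string.h>

/* Number of bytes a "%s" conversion would emit for buf: it stops at the
   first null character, exactly as strlen() does. */
size_t bytes_seen_by_percent_s(const char *buf)
{
    return strlen(buf);
}

/* Write all len bytes of buf, embedded null characters included.
   Returns the number of bytes actually written. */
size_t write_all_bytes(const char *buf, size_t len, FILE *out)
{
    return fwrite(buf, 1, len, out);
}
```

For a buffer that begins with a null byte, the first helper returns 0, which matches the empty output seen for the "14:37:00" line, while the second still writes every byte it is given.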

0
chux - Reinstate Monica On

fgets() can read a null character, save it in its buffer and continue reading. That is not the direct problem.

It is printf("%s", line); that is the issue as printing stops on the first null character in line, be it a null character read in by fgets() or the null character appended by fgets() when it was done.

Instead of printf("%s", line);, print the n characters read with fwrite(line, 1, n, stdout).

With fgets(), finding the number of characters read n is problematic.
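A short sketch of why the length is hard to recover: the obvious choice, strlen() on the buffer, stops at the first null character that fgets() stored. The helper name below is hypothetical and only illustrates the problem; it is not a fix.

```c
#include <stdio.h>
#include <string.h>

/* Read one line with fgets() and return what strlen() reports for it.
   Returns (size_t)-1 if fgets() fails or hits end-of-file immediately. */
size_t fgets_reported_length(FILE *fp, char *buf, int size)
{
    if (fgets(buf, size, fp) == NULL) {
        return (size_t)-1;
    }
    return strlen(buf);
}
```

For a line that begins with null bytes, the reported length is 0 even though the rest of the line is sitting in the buffer after those bytes.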


Alternative 1:

Read a line which might contain null characters and record the number of characters read.

Illustrative code (unchecked) which uses "%4096[^\n]" to read much of a line (not the '\n') and "%n" to record the number of characters read.

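// Fragment: fp, i and n come from the question's code; assert() needs <assert.h>.
char line[4096+1];
int len;
int conversion_cnt;
while (i < n && (conversion_cnt = fscanf(fp, "%4096[^\n]%n", line, &len)) != EOF) {
  if (conversion_cnt == 1) {
    // Some non-'\n' characters were read: write exactly len of them.
    fwrite(line, 1, (size_t) len, stdout);
  } else {
    // Nothing matched, so the next character is '\n' or end-of-file.
    int ch = fgetc(fp);
    if (ch == EOF) {
      break;
    }
    assert(ch == '\n');  // Only '\n' expected here.
    i++;
    fputc(ch, stdout);
  }
}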

Alternative 2:

Read blocks of data with fread() and stop once the 20th '\n' found. Use memchr() or use a loop to find the '\n'.

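// Fragment: fp, i and n come from the question's code.
char block[4096];
size_t len;
while (i < n && (len = fread(block, 1, sizeof block, fp)) > 0) {
  for (size_t j = 0; j < len; j++) {
    if (block[j] == '\n') {
      i++;
      if (i >= n) {
        len = j + 1;  // Keep only the bytes up to and including the n-th '\n'.
        break;
      }
    }
  }
  fwrite(block, 1, len, stdout);
}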