Counting lines in a file excluding the empty lines in C


We have a program that will take a file as input, and then count the lines in that file, but without counting the empty lines.

There is already a post on Stack Overflow with this question, but its answer doesn't cover my case.

Let's take a simple example.

File:

I am John\n
I am 22 years old\n
I live in England\n

If the last '\n' didn't exist, the counting would be easy. We already have a function that does this:

#include <stdint.h>
#include <stdio.h>

/* Reads a file and returns the number of lines in this file. */
uint32_t countLines(FILE *file) {
  uint32_t lines = 0;
  int32_t c;
  while (EOF != (c = fgetc(file))) {
    if (c == '\n') {
      ++lines;
    }
  }
  /* Reset the file pointer to the start of the file */
  rewind(file);
  return lines;
}

This function, when taking as input the file above, counted 4 lines. But I only want 3 lines.

I tried to fix this in many ways.

First I tried calling fgets on every line and comparing the result with the string "\0". I thought that treating a line that was just "\0", with nothing else, as empty would solve the problem.

I also tried some other approaches, but none of them worked.

What I basically want is to look at the last character in the file (excluding '\0') and check whether it is '\n'. If it is, subtract 1 from the number of lines previously counted by the original function. I don't really know how to do this, though. Are there any easier ways?

I would appreciate any type of help. Thanks.

There are 4 answers

0
d909b On BEST ANSWER

You can fix this efficiently by also keeping track of the previous character.

This works because an empty line is always preceded by a '\n': a newline whose previous character was also a newline (or that is the very first character of the file) ends an empty line and is not counted.

#include <stdint.h>
#include <stdio.h>

/* Reads a file and returns the number of non-empty lines in this file. */
uint32_t countLines(FILE *file) {
  uint32_t lines = 0;
  int32_t c;
  int32_t last = '\n';
  while (EOF != (c = fgetc(file))) {
    if (c == '\n' && last != '\n') {
      ++lines;
    }
    last = c;
  }
  /* Reset the file pointer to the start of the file */
  rewind(file);
  return lines;
}
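
As a quick check of this approach, here is a minimal sketch that runs the function on a few in-memory samples. The helper countLinesInString is hypothetical (it is not part of the answer) and assumes tmpfile() can create a scratch file. Note what it shows for an unterminated last line: a final line that lacks a trailing '\n' is not counted by this version.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* countLines as in the answer above: a '\n' is counted only when the
   previous character was not also '\n'. */
uint32_t countLines(FILE *file) {
  uint32_t lines = 0;
  int32_t c;
  int32_t last = '\n';
  while (EOF != (c = fgetc(file))) {
    if (c == '\n' && last != '\n') {
      ++lines;
    }
    last = c;
  }
  /* Reset the file pointer to the start of the file */
  rewind(file);
  return lines;
}

/* Hypothetical helper for experimenting: writes text to a temporary file
   and counts its lines. Returns UINT32_MAX if no temporary file could be
   created. */
uint32_t countLinesInString(const char *text) {
  assert(text != NULL);
  FILE *tmp = tmpfile();
  if (tmp == NULL) {
    return UINT32_MAX;
  }
  fputs(text, tmp);
  rewind(tmp);
  uint32_t n = countLines(tmp);
  fclose(tmp);
  return n;
}
```

With these definitions, countLinesInString("a\n\n\nb\n") gives 2, while countLinesInString("a\nb") gives 1 because the unterminated "b" is skipped.
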
0
Soner from The Ottoman Empire On

First, detect lines that consist only of whitespace. Let's create a function to do that.

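#include <ctype.h>
#include <stdbool.h>

/* Returns true if the line holds nothing but whitespace (including '\n'). */
bool stringIsOnlyWhitespace(const char *line) {
    int i;
    for (i = 0; line[i] != '\0'; ++i)
        /* Cast to unsigned char: passing a negative char to isspace() is undefined. */
        if (!isspace((unsigned char)line[i]))
            return false;
    return true;
}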

Now that we have a test function, let's build a loop around it.

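/* Assumes fp is a FILE * already opened for reading; 256 is an arbitrary buffer size. */
char line[256];
int notemptyline = 0;

while (fgets(line, sizeof line, fp)) {
    if (!stringIsOnlyWhitespace(line))
        notemptyline++;
}

printf("\n The number of nonempty lines is: %d\n", notemptyline);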

Source: Bill Lynch; I've changed it a little.

3
lkamp On

I think your approach using fgets() is totally fine. Try something like this:

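/* Needs <stdio.h> and <string.h>. */
char line[200];
unsigned int lines = 0;

while (fgets(line, sizeof line, file) != NULL) {
    /* An empty line is just "\n" (length 1), so count only longer ones. */
    if (strlen(line) > 1) {
        lines++;
    }
}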

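If you don't know the maximum line length in your files, you may want to check whether line actually holds a whole line: fgets() stops once the buffer is full, so a longer line arrives in several pieces, and only the last piece ends with '\n'.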
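
One way to do that check, as a rough sketch: a piece returned by fgets() starts a new line only if the previous piece ended with '\n', which strchr() can detect. The function name countNonEmptyLines and the deliberately small 16-byte buffer are assumptions for illustration, and a line counts as empty here only if it is exactly "\n".

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Counts lines that have at least one character before the '\n',
   even when a line is longer than the buffer. */
unsigned int countNonEmptyLines(FILE *file) {
    char piece[16];        /* small on purpose, so long lines get split */
    unsigned int lines = 0;
    int atLineStart = 1;   /* does the next piece begin a new line? */

    assert(file != NULL);
    while (fgets(piece, sizeof piece, file) != NULL) {
        /* Only the first piece of a line decides whether it is empty. */
        if (atLineStart && piece[0] != '\n') {
            lines++;
        }
        /* A piece that contains '\n' finishes the current line. */
        atLineStart = (strchr(piece, '\n') != NULL);
    }
    return lines;
}
```

Because the decision is made on the first piece of each line, a long line is counted once no matter how many pieces it is read in, and a final line without a trailing '\n' is still counted.
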

Edit:

Of course this depends on how you define an empty line. If you treat a line containing only whitespace as empty, the above code will not work, because strlen() counts whitespace characters too.
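
If whitespace-only lines should count as empty too, one option is to drop the buffer entirely and read character by character, remembering whether the current line has shown any non-whitespace character yet. This is only a sketch: countNonBlankLines is a made-up name, and it also counts a final line that has no trailing '\n'.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

/* Counts lines that contain at least one non-whitespace character. */
unsigned int countNonBlankLines(FILE *file) {
    unsigned int lines = 0;
    int sawContent = 0;    /* has the current line shown a non-space char? */
    int c;

    assert(file != NULL);
    while ((c = fgetc(file)) != EOF) {
        if (c == '\n') {
            if (sawContent) {
                lines++;
            }
            sawContent = 0;            /* start of the next line */
        } else if (!isspace(c)) {      /* fgetc() yields an unsigned char value */
            sawContent = 1;
        }
    }
    if (sawContent) {
        lines++;                       /* last line without a trailing '\n' */
    }
    return lines;
}
```

Since nothing is buffered, line length does not matter here, at the cost of one fgetc() call per character.
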

1
David R Tribble On

Here is a slightly better algorithm.

#include <stdio.h>

// Reads a file and returns the number of lines in it, ignoring empty lines
unsigned int countLines(FILE *file)
{
    unsigned int  lines = 0;
    int           c = '\0';
    int           pc = '\n';

    while (c = fgetc(file), c != EOF)
    {
        if (c == '\n'  &&  pc != '\n')
            lines++;
        pc = c;
    }
    if (pc != '\n')
        lines++;

    return lines;
}

Only the first newline in any sequence of newlines is counted, since all but the first newline indicate blank lines.

Note that if the file does not end with a '\n' newline character, any characters encountered (beyond the last newline) are considered a partial last line. This means that reading a file with no newlines at all returns 1.

Reading an empty file will return 0.

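Reading a file that contains a single line of text ending with a newline will return 1 (a file containing only a newline returns 0, since that line is empty).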

(I removed the rewind() since it is not necessary.)