Different results when counting the lines of a file with wc -l and cat -n


I heard that wc -l could count the number of lines in a file. However, when I use it to count lines of a file that was generated by Python, it gives a different result, miscounting one line.

Here is the MWE.

#!/usr/bin/env python

import random

def getRandomLines(in_str, num):
    # Build `num` random permutations of the characters in `in_str`.
    res  = list()
    lstr = len(in_str)
    for i in range(num):
        res.append(''.join(random.sample(in_str, lstr)))
    return res

def writeRandomLines(rd_lines, fname):
    # Join the lines with '\n' as a separator (no trailing newline).
    lines = '\n'.join(rd_lines)
    with open(fname, 'w') as fout:
        fout.write(lines)

if __name__ == '__main__':
    writeRandomLines(getRandomLines("foobarbazqux", 20), "example.txt")

This gives a file, example.txt, that contains 20 lines of random strings, so the expected number of lines in example.txt is 20. However, when one applies wc -l to it, it gives 19 as the result.

$ wc -l example.txt
19 example.txt

When one uses cat -n to show the contents of the file with line numbers, one sees 20 lines:

$ cat -n example.txt
     1  oaxruzaqobfb
     2  ozbarboaufqx
     3  fbzarbuoxoaq
     4  obqfarbozaxu
     5  xoqbrauboazf
     6  ufqooxrababz
     7  rqoxafuzboab
     8  bfuaqoxaorbz
     9  baxroazfouqb
    10  rqzafoobxaub
    11  xqaoabbufzor
    12  aobxbaoruzfq
    13  buozaqbrafxo
    14  aobzoubfarxq
    15  aquofrboazbx
    16  uaoqrfobbaxz
    17  bxqubarfoazo
    18  aaxruzofbboq
    19  xuaoarzoqfbb
    20  bqouzxraobfa

Why does wc -l miscount by one line, and what can I do to fix this problem?

Any clues or hints will be appreciated.


There are 3 answers

0
fredtantini On BEST ANSWER

In your python code, you have:

    lines = '\n'.join(rd_lines)

So what you are really writing is:

word1\nword2\n...wordX-1\nwordX

Unfortunately, in man wc:

-l, --lines
    print the newline counts 

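Hence the difference: 20 words joined by '\n' contain only 19 newline characters, and the last word is not followed by one.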
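A minimal sketch of that arithmetic, using placeholder words (the names `words` and `joined` are only for illustration): joining 20 strings with '\n' yields 19 newline characters, which is the number wc -l reports, even though the text splits into 20 lines.

```python
# Placeholder data: 20 short strings standing in for the random lines.
words = ["word%d" % i for i in range(1, 21)]

# '\n'.join puts a newline BETWEEN items, not after the last one.
joined = "\n".join(words)

newline_count = joined.count("\n")     # what `wc -l` counts
line_count = len(joined.splitlines())  # what a human (or `cat -n`) sees

print(newline_count)  # 19
print(line_count)     # 20
```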

0
Kostas Drk On

wc -l needs to see a \n at the end of a line to count it. The last line of your file has no trailing \n, so wc -l does not count it. Add the newline and the count will be correct.
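One way to add that newline, as a sketch rather than the only fix: write each line followed by '\n' instead of joining with it. The helper names `write_random_lines` and `count_newlines` are made up for this example, and the input lines are placeholders.

```python
import os
import tempfile

def write_random_lines(rd_lines, fname):
    # Terminate every line, including the last one, with '\n'.
    with open(fname, "w") as fout:
        fout.writelines(line + "\n" for line in rd_lines)

def count_newlines(fname):
    # Count '\n' bytes in the file, which is what `wc -l` reports.
    with open(fname, "rb") as fin:
        return fin.read().count(b"\n")

# Usage: write 20 placeholder lines to a temporary file and count them.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.txt")
    write_random_lines(["line%d" % i for i in range(20)], path)
    newline_total = count_newlines(path)

print(newline_total)  # 20
```

An equivalent change to the original function would be to write `'\n'.join(rd_lines) + '\n'`.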

0
Vijayakumar Udupa On

wc -l only counts the number of newline characters. Since you join the lines with '\n' as a separator, joining 20 lines uses only 19 '\n' characters. Hence the result is 19.

If you need the correct count, terminate each line with '\n'.
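If the file cannot be regenerated, the lines can also be counted in a way that includes a final unterminated line. The sketch below assumes a small text file; `count_lines` is a hypothetical helper, not part of wc or any library. It relies on the fact that iterating over a Python file object yields the last line even when it lacks a trailing '\n'.

```python
import os
import tempfile

def count_lines(fname):
    # Iterating over a file yields every line, including a final
    # line that is not terminated by '\n'.
    with open(fname) as fin:
        return sum(1 for _ in fin)

# Usage: three lines joined with '\n', so there is no trailing newline.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.txt")
    with open(path, "w") as fout:
        fout.write("\n".join(["a", "b", "c"]))
    total = count_lines(path)

print(total)  # 3
```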