I need to count the number of records across 6 files, each containing 4 million records, and the count should be as fast as possible. However, there is another file with a similar name that should be omitted from the count:
fileSales_1.txt (4 million records)
fileSales_2.txt (4 million records)
fileSales_3.txt (4 million records)
fileSales_4.txt (4 million records)
fileSales_5.txt (4 million records)
fileSales_6.txt (4 million records)
fileSales_unique.txt (24 million records)
I'm counting the records with the following command: awk 'END {print NR}' fileSales_*.txt
However, in doing so, the fileSales_unique.txt file is also counted, giving a total of 48 million records.
Could you help me with a command that counts the records only for files 1 to 6? The result should be 24 million records, something like awk 'END {print NR}' fileSales_(1 to 6).txt
Suppose you have these files (using wc to show both the file names and their sizes), as sketched below.
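For example, a quick listing of every matching file and its line count (this glob also picks up fileSales_unique.txt, which is exactly the problem):

```
# Per-file line counts plus a final "total" line;
# fileSales_unique.txt is included because it matches the glob.
wc -l fileSales_*.txt
```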
There are many ways to achieve your goal, but two primary ones.
Inclusion glob: any of these patterns matches only files 1 through 6:

```
wc -l fileSales_{1..6}.txt
wc -l fileSales_?.txt
wc -l fileSales_[1-6].txt
```

Any of those excludes fileSales_unique.txt and gives you the 24 million total you expect.
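If you only need the grand total rather than the per-file breakdown, a couple of standard shell variants (an aside, relying on wc printing a final "total" line when given multiple files) would be:

```
# Keep only the final "total" line.
wc -l fileSales_[1-6].txt | tail -n 1

# Or concatenate first so wc prints a single number.
cat fileSales_[1-6].txt | wc -l
```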
(The same concept applies to awk; see the example below.)
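For instance, applying the same globs to the awk command from the question (with print spelled out) might look like this; NR accumulates across all the input files, so END prints the combined total:

```
# Any of these globs excludes fileSales_unique.txt.
awk 'END {print NR}' fileSales_{1..6}.txt
awk 'END {print NR}' fileSales_?.txt
awk 'END {print NR}' fileSales_[1-6].txt
```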
Or, maintain a skip array in Bash and filter the file list before counting; then your original method works, because awk only ever sees the files you want. A sketch follows.
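A minimal sketch of the skip-array approach; the loop structure and variable names here are illustrative assumptions:

```
#!/usr/bin/env bash
# Requires bash 4+ for associative arrays.

# Files to leave out of the count.
declare -A skip=( ["fileSales_unique.txt"]=1 )

# Build the list of files to count, omitting anything in the skip array.
files=()
for f in fileSales_*.txt; do
    [[ -n ${skip[$f]} ]] && continue   # skip unwanted files
    files+=("$f")
done

# Your original command, now fed only the wanted files.
awk 'END {print NR}' "${files[@]}"
```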
Know that wc in this case will likely be monumentally faster than awk...
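If you want to check that on your own data, a rough comparison (timings will vary with hardware and filesystem caching) could be:

```
# Same six files, two tools.
time wc -l fileSales_[1-6].txt
time awk 'END {print NR}' fileSales_[1-6].txt
```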