Different results from awk and nawk

Question

Asked by Ankit on 13 September 2013 at 14:49

I just ran these two commands on a file having around 250 million records.

awk '{if(substr($0,472,1)=="9") print $0}' < file1.txt >> file2.txt

and

nawk '{if(substr($0,472,1)=="9") print $0}' < file1.txt >> file2.txt

The record length is 482. The first command produced the correct number of records in file2.txt, i.e. 60 million, but the nawk command produced only 4.2 million.

I am confused and would like to know whether anyone has come across an issue like this. How exactly is this simple command being treated differently internally? Is there a buffer in nawk that can hold only up to a certain number of bytes?

I would appreciate it if someone could shed some light on this.

My OS details are:

SunOS <hostname> 5.10 Generic_147148-26 i86pc i386 i86pc

There are 2 answers

Ed Morton On 13 September 2013 at 17:09

Your command can be reduced to just this:

awk 'substr($0,472,1)==9'

On Solaris (which you are on), running awk by default runs old, broken awk (/usr/bin/awk), so I suspect that nawk is the one producing the correct result.

Run /usr/xpg4/bin/awk with the same script/arguments and see which of your other results its output agrees with.
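One way to make that comparison is to have each awk print a count instead of appending to file2.txt, so output from successive runs cannot get mixed together. This is only a sketch: count_matches is a made-up helper name, and it assumes nawk is on your PATH.

```shell
# count_matches AWK_BINARY FILE
# Prints how many records have "9" at character position 472.
# (count_matches is a hypothetical helper, not a standard tool.)
count_matches() {
    awk_bin=$1
    file=$2
    "$awk_bin" 'substr($0,472,1)=="9" { n++ } END { print n + 0 }' "$file"
}

# Example usage on the Solaris box (paths taken from this answer):
#   count_matches /usr/bin/awk      file1.txt
#   count_matches nawk              file1.txt
#   count_matches /usr/xpg4/bin/awk file1.txt
```

If two of the three agree, the odd one out is the likely culprit.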

Also, check whether your input file was created on Windows by running dos2unix on it and seeing if its size changes. If it does, re-run your awk commands on the converted file. A file created on Windows will have control-Ms (carriage returns) in it that could be causing chaos.
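If you would rather not modify the file just to find out, you could count the lines that end in a carriage return. A minimal sketch, assuming an awk that understands \r in a regular expression (POSIX awks do; I have not verified that old /usr/bin/awk does). count_cr_lines is a hypothetical name.

```shell
# count_cr_lines FILE
# Prints the number of lines whose last character is a carriage return
# (control-M), which suggests Windows-style line endings.
# Assumption: the awk in use supports the \r escape in regexes.
count_cr_lines() {
    awk '/\r$/ { n++ } END { print n + 0 }' "$1"
}

# Example usage:
#   count_cr_lines file1.txt
```

A result of 0 would suggest line endings are not the problem.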

konsolebox On 13 September 2013 at 15:18 BEST ANSWER

The difference probably lies in nawk's buffer limit. One of the records (lines) in your input file has probably exceeded it.

This crucial line can be found in awk.h:

#define RECSIZE (8 * 1024)  /* sets limit on records, fields, etc., etc. */
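Since every record is supposed to be 482 characters long, a record that overruns that limit should stand out in a tally of line lengths. Here is a rough sketch of such a tally; record_length_histogram is a made-up name, and it assumes the awk running it does not itself truncate or split long records, so use whichever awk you trust most.

```shell
# record_length_histogram FILE
# Prints one "<length> <count>" pair per distinct record length,
# sorted by length. Only one counter per distinct length is kept,
# so memory use stays small even for very large files.
# (record_length_histogram is a hypothetical helper.)
record_length_histogram() {
    awk '{ count[length($0)]++ }
         END { for (len in count) print len, count[len] }' "$1" | sort -n
}

# Example usage:
#   record_length_histogram file1.txt
```

Anything other than a single line for length 482 (or 483 if there are carriage returns) would point at malformed or oversized records.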
