Replace empty field at the end of line using awk

387 views Asked by At

I am stuck. Say I have tab separated file with 4 columns.

A326931873  2014-10-26 09:55:28.000 N06 D00030131051410
A326931902  2014-10-26 09:56:10.000 N06 

And want to replace empty field at the end of the line with \N. The result should be

A326931873  2014-10-26 09:55:28.000 N06 D00030131051410
A326931902  2014-10-26 09:56:10.000 N06 \N

I have tried awk -F'\t' 'BEGIN {sub("", "\N", $4); print}' file, awk -F'\t' 'BEGIN {sub(/^&/, "\N", $4); print}' file and alike with no luck.

Update1: As suggested I have also tried using number of fields, but awk -F"\t" 'NF<4 {print}' file returns no results, that is all lines have four fields.

Update2: cat -vET file shows

A326931873^I2014-10-26 09:55:28.000^IN06^ID00030131051410^M$
A326931902^I2014-10-26 09:56:10.000^IN06^I^M$

Update3: after removing carriage returns cat -vET file shows

A326931873^I2014-10-26 09:55:28.000^IN06^ID00030131051410$
A326931902^I2014-10-26 09:56:10.000^IN06^I$

and any of the suggested solutions does not work.

4

There are 4 answers

1
danas.zuokas On

So I have figured out the answer myself awk -F'\t' -v OFS='\t' 'length($4)==1{$4="\N"}1' file

0
Kent On

I assume that for the line with empty $4, there was a trailing <tab> try this line:

awk -F'\t' '$NF=!$NF?"\\N":$NF' file

If there is no <tab> before the empty field:

awk -F'\t' -v OFS="\t" 'NF==3{$4="\\N"}7' file
0
fedorqui On

If you know how many fields there should be, you can provide that value to the script.

Then, in case the current line has less fields you can append this given text as the last one:

awk -v fields=5 -F"\t" 'NF<fields {$fields="\\N"} 1' file

For your given input it returns:

A326931873 2014-10-26 09:55:28.000 N06 D00030131051410
A326931902 2014-10-26 09:56:10.000 N06 \N

Notes:

  • this works because NF stands for Number of fields, so that it indicates how many fields the current line has.
  • we have to say $fields="\\N" because \ needs to be escaped to be handled properly. Otherwise, if you just use \N, you might get an error like this:

warning: escape sequence \N' treated as plainN'

0
nu11p01n73R On

You can use a very simple format like

$ awk 'NF<5{$5="\\N"}1' temp
A326931873  2014-10-26 09:55:28.000 N06 D00030131051410
A326931902 2014-10-26 09:56:10.000 N06 \N