Using AWK to process two different files consecutively


I am trying to process two files consecutively with awk. The last line of the first file contains a date, and I want to use that date as input when evaluating the second file. Unfortunately, I have trouble understanding how to detect the end of the first file, read the date, and then continue with the next file. I have found some answers that use FNR==NR, but I have not been able to implement them correctly. I tried a poor man's solution of hardcoding the number of lines, but that is not a terribly smart thing to do, and I still have problems processing the second file:

    BEGIN { initialize the counters }

    {
    if(NR==FNR)   # <<< this is needed to run properly; only NR==FNR fails, why?!
    {
          # file_1
          do -> from the last line of the first file, extract a date

          next    # << what is the meaning of this??
    }

    {
          # file_2
          do -> read every line of the second file
                and sum up the values from one of the columns
    }

    }

    END { divide the sum accumulated from file_2
          by the time calculated from the last line of file_1 }

# for calling the script use :
awk -f SCRIPT file_1 file_2

#example files
# file1 last line
version 1.5 code 11 mpi start /01/12/2014/ 18:33:12 end /01/12/2014/ 20:05:12

#file2

     1.28371E-05    0.2060    0.2060   -8   -8    0    0    0
     1.91616E-05    0.1927    0.1927   -7   -8    0    0    0
     1.27306E-05    0.1567    0.1567   -6   -8    0    0    0
     2.11623E-05    0.1523    0.1523   -5   -8    0    0    0
     1.67914E-05    0.1721    0.1721   -4   -8    0    0    0
     1.47247E-05    0.1851    0.1851   -3   -8    0    0    0
     1.32049E-05    0.1919    0.1919   -2   -8    0    0    0
     1.81256E-05    0.2130    0.2130   -1   -8    0    0    0
     2.63500E-05    0.1745    0.1745    0   -8    0    0    0
     1.99232E-05    0.1592    0.1592    1   -8    0    0    0
     2.08924E-05    0.1537    0.1537    2   -8    0    0    0
     2.44922E-05    0.1459    0.1459    3   -8    0    0    0
     2.53759E-05    0.1902    0.1902    4   -8    0    0    0
     2.30230E-05    0.1708    0.1708    5   -8    0    0    0
     2.10723E-05    0.1636    0.1636    6   -8    0    0    0
     1.86613E-05    0.1915    0.1915    7   -8    0    0    0
     2.05359E-05    0.1649    0.1649    8   -8    0    0    0
     1.09533E-05    0.1765    0.1765   -8   -7    0    0    0
     1.56917E-05    0.1740    0.1740   -7   -7    0    0    0
     1.52199E-05    0.2145    0.2145   -6   -7    0    0    0
     .....   

I would appreciate any help. Thank you in advance.

Alex

There are 4 answers

4
Ed Morton On BEST ANSWER

It sounds like all you need is something like:

awk '
NR==FNR {
   # do file1 stuff
   date = $0
   next
}
{
   # do file2 stuff using the variable "date", which is set to the last line of file1
}
' file1 file2

If that's not all you need, post some sample input and expected output to help clarify what you're trying to do.
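To make the pattern above concrete, here is a minimal sketch applied to the sample data from the question. NR counts records across all files while FNR restarts at 1 for each file, so NR==FNR is true only while the first file is being read, and next skips the remaining rules for that line. Several things here are assumptions rather than anything stated in the question: the column to sum (column 2), the position of the two timestamps (fields 8 and 11 of the last line), start and end falling on the same day, and the helper name to_seconds.

```shell
# Hypothetical sample inputs, modelled on the files in the question.
tmp=$(mktemp -d)
printf '%s\n' \
  'some earlier line of file_1' \
  'version 1.5 code 11 mpi start /01/12/2014/ 18:33:12 end /01/12/2014/ 20:05:12' > "$tmp/file_1"
printf '%s\n' \
  '     1.28371E-05    0.2060    0.2060   -8   -8    0    0    0' \
  '     1.91616E-05    0.1927    0.1927   -7   -8    0    0    0' > "$tmp/file_2"

result=$(awk '
# Convert HH:MM:SS to seconds since midnight (assumes both times are on the same day).
function to_seconds(t,    p) { split(t, p, ":"); return p[1] * 3600 + p[2] * 60 + p[3] }

# First file only: NR (records read so far) equals FNR (records read in the current file).
NR == FNR {
    t_start = $8     # overwritten on every line, so the last line wins
    t_end   = $11
    next             # skip the remaining rules for this line
}

# Second file: accumulate column 2 (an assumed choice of column).
{ sum += $2 }

END {
    elapsed = to_seconds(t_end) - to_seconds(t_start)
    if (elapsed > 0)
        printf "elapsed=%d sum=%.4f rate=%.6e\n", elapsed, sum, sum / elapsed
}
' "$tmp/file_1" "$tmp/file_2")

printf '%s\n' "$result"
rm -rf "$tmp"
```

With these two sample rows, the elapsed time is 5520 seconds (18:33:12 to 20:05:12) and the column sum is 0.3987. If the run can cross midnight, the date fields would have to be parsed as well, for example with gawk's mktime.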

3
Jan On

I set variables on the command line to accomplish this:

awk 'F==1 {print "one: ", $0} F==2 {print "two: ", $0}' F=1 one.txt F=2 two.txt

Whenever an argument of the form x=y is encountered among the file names, awk sets the variable x to y before reading the next file.
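One case where this approach may be preferable to NR==FNR is an empty first file: NR and FNR are then both 1 when the second file starts, so its lines are mistaken for first-file lines. The sketch below compares the two on a deliberately empty one.txt; the file contents and variable names are made up for illustration.

```shell
tmp=$(mktemp -d)
: > "$tmp/one.txt"                 # deliberately empty first file
printf 'x\n' > "$tmp/two.txt"

# NR==FNR misattributes the line: NR and FNR are both 1 when two.txt starts.
by_nr=$(awk 'NR == FNR { print "one:", $0; next } { print "two:", $0 }' \
        "$tmp/one.txt" "$tmp/two.txt")

# The F=... assignments take effect when awk reaches them in the argument list.
by_flag=$(awk 'F == 1 { print "one:", $0 } F == 2 { print "two:", $0 }' \
          F=1 "$tmp/one.txt" F=2 "$tmp/two.txt")

printf '%s\n%s\n' "$by_nr" "$by_flag"
rm -rf "$tmp"
```

The first command reports the line as belonging to file one, while the second attributes it to file two.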

0
Mark Setchell On

If you just want the date from the last line of the first file and the contents of the second file for processing by awk, you can do this and make life easier:

(tail -n 1 firstfile; cat secondfile ) | awk 'something' -

Of course, if the date is not exactly the last line, you could do something like this:

(grep '^Date' firstfile; cat secondfile ) | awk 'something' -

This way you will only have a single "file/stream" to deal with in awk and the first line will be your date.
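As a rough sketch of what 'something' could look like with this approach: the date line arrives as record 1, and everything after it is data. The sample contents, the field positions of the timestamps (8 and 11), and the summed column (2) are assumptions made for illustration.

```shell
tmp=$(mktemp -d)
printf '%s\n' \
  'some earlier line' \
  'version 1.5 code 11 mpi start /01/12/2014/ 18:33:12 end /01/12/2014/ 20:05:12' > "$tmp/firstfile"
printf '%s\n' \
  '     1.28371E-05    0.2060    0.2060   -8   -8    0    0    0' \
  '     1.91616E-05    0.1927    0.1927   -7   -8    0    0    0' > "$tmp/secondfile"

result=$( { tail -n 1 "$tmp/firstfile"; cat "$tmp/secondfile"; } | awk '
NR == 1 { t_start = $8; t_end = $11; next }   # the single line supplied by tail
{ rows++; sum += $2 }                         # every later line comes from secondfile
END { printf "start=%s end=%s rows=%d sum=%.4f\n", t_start, t_end, rows, sum }
' )

printf '%s\n' "$result"
rm -rf "$tmp"
```

Because awk sees a single stream, no NR==FNR test is needed; the trade-off is that FILENAME and FNR no longer distinguish the two inputs.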

1
Chris Seymour On

You can do this in a couple of ways:

  • Buffer each line and check when FNR==1. Something like:

awk 'FNR==1 && NR!=1{print line,"is last in first file"} NR>1{print line} {line=$0}' file1 file2

  • If you are using gawk, you can use the ENDFILE block:

gawk '{line=$0; print} ENDFILE {if (!f++) print line,"is last line in first file"}' file1 file2
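The buffering idea can be reduced to a small sketch that works in any POSIX awk: remember the previous line, and when a new file starts (FNR==1 but NR!=1), that remembered line is the last line of the preceding file. The file contents and the variable names prev and last_of_first are hypothetical, and the sketch assumes exactly two non-empty input files.

```shell
tmp=$(mktemp -d)
printf '%s\n' a b > "$tmp/file1"
printf '%s\n' c d > "$tmp/file2"

result=$(awk '
FNR == 1 && NR != 1 { last_of_first = prev }   # a new file just started
{ prev = $0 }                                  # buffer the current line
END { print "last line of first file:", last_of_first }
' "$tmp/file1" "$tmp/file2")

printf '%s\n' "$result"
rm -rf "$tmp"
```

In this run, the captured line is b, the final line of file1.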