I have the following file:

application_1.pp

application_2.pp

    #application_2_version => '1.0.0.1-r1',
    application_2_version => '1.0.0.2-r3',

application_3.pp

    #application_3_version => '2.0.0.1-r4',
    application_3_version => '2.0.0.2-r7',

application_4.pp

application_5.pp

    #application_5_version => '3.0.0.1-r8',
    application_5_version => '3.0.0.2-r9',

I would like to be able to read this file and search for the string

".pp"

When that string is found, the script should store that line in a variable and then read the next line of the file. If it encounters a line that starts with a #, it should ignore it and move on to the next line.

If it comes across a line that does not contain ".pp" and doesn't start with #, it should print that line next to the last stored line in a new file.

The output would look like this:

application_1.pp
application_2.pp    application_2_version => '1.0.0.2-r3',  
application_3.pp    application_3_version => '2.0.0.2-r7',
application_4.pp
application_5.pp    application_5_version => '3.0.0.2-r9',

I would like to achieve this with awk. If somebody knows how to do this and it is a simple solution, I would be happy if they could share it with me. If it is more complex, it would be helpful to know which parts of awk I need to understand in order to do this (arrays, variables, etc.). Can it even be achieved with awk, or is another tool necessary?

Thanks,


There are 2 answers

Wintermute (best answer)

I'd say

awk '/\.pp/ { if(NR != 1) print line; line = $0; next } NF != 0 && substr($1, 1, 1) != "#" { line = line $0 } END { print line }' filename

This works as follows:

/\.pp/ {                                # if a line contains ".pp"
  if(NR != 1) {                         # unless we just started
    print line                          # print the last assembled line
  }
  line = $0                             # and remember this new one
  next                                  # and we're done here.
}

NF != 0 && substr($1, 1, 1) != "#" {    # otherwise, unless the line is empty
                                        # or a comment
  line = line $0                        # append it to the line we're building
}

END {                                   # in the end,
  print line                            # print the last line.
}
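
A slightly more defensive variant of the script above, as a sketch only: it assumes the same input layout as in the question and uses a flag (the hypothetical variable have) instead of NR != 1, so the first ".pp" line does not have to be line 1 of the file. The function name join_pp is also made up for this example.

```shell
#!/bin/sh
# join_pp: hypothetical wrapper name, not part of the original answer.
join_pp() {
  awk '
    /\.pp/ {                      # a line naming a .pp file
      if (have) print line        # flush the previous record, if there is one
      line = $0                   # start a new record with this line
      have = 1
      next
    }
    have && NF && $1 !~ /^#/ {    # non-empty, non-comment line after a .pp line
      line = line $0              # append it, keeping its leading indentation
    }
    END { if (have) print line }  # flush the final record
  ' "$@"
}

# Demo input that starts with a blank line, so the first match is not on line 1.
join_pp <<'EOF'

application_1.pp

application_2.pp

    #application_2_version => '1.0.0.1-r1',
    application_2_version => '1.0.0.2-r3',
EOF
```

Calling join_pp filename instead of feeding a here-document should work the same way, since awk reads standard input only when no file is given.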

bkmoney

You can use sed:

#n
/\.pp/{
    h
    :loop
    n
    /[^#]application.*version/{
        H
        g
        s/\n[[:space:]]*/\t/
        p
        b
    }
    /\.pp/{
        x
        p
    }
    b loop
}

If you save this as s.sed and run

sed -f s.sed file

you will get this output:

application_1.pp
application_2.pp    application_2_version => '1.0.0.2-r3',
application_3.pp    application_3_version => '2.0.0.2-r7',
application_4.pp
application_5.pp    application_5_version => '3.0.0.2-r9',

Explanation

The #n on the first line of the script suppresses automatic printing, just like the -n option.

Once we match the /\.pp/, we store that line into the hold space with h, and start the loop.

We read the next line into the pattern space with n.

If it matches /[^#]application.*version/, meaning "application" is preceded by some character other than # (so commented-out lines are skipped), we append the line to the hold space with H, copy the hold space to the pattern space with g, and replace the newline and any following whitespace with a tab. Finally we print with p and branch to the end of the script with b.

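If it matches /\.pp/ instead, we swap the pattern and hold spaces with x and print with p. This prints the previously stored ".pp" line, which had no version line, and leaves the new one in the hold space. In either case, b loop then jumps back to :loop to read the next line.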
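
The hold-space steps described above (h, n, H, g, then the substitution) can be tried in isolation. The following is only a sketch under a simplifying assumption: every ".pp" line is immediately followed by exactly one version line, so the loop and the comment check are left out. The function name join_pairs and the variable tab are made up for this illustration.

```shell
#!/bin/sh
# join_pairs: hypothetical helper that joins each pair of lines with a tab,
# using the same h / n / H / g sequence as the script above.
join_pairs() {
  tab=$(printf '\t')   # literal tab, so no reliance on \t in the replacement
  # h: copy line 1 to the hold space     n: read line 2 into the pattern space
  # H: append line 2 to the hold space   g: copy "line1\nline2" back
  # s: replace the newline and the indentation with one tab, then p prints it
  sed -n "h;n;H;g;s/\n[[:space:]]*/$tab/;p" "$@"
}

# Prints the two input lines joined by a single tab.
printf 'application_2.pp\n    application_2_version\n' | join_pairs
```

With an odd number of input lines, n finds no next line and sed quits without printing the last one. The full script appears to share this limitation: a final ".pp" line with no version line after it would not be printed.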