How can I use grep to retrieve two patterns within the same fixed-length sections of text?


I'm trying to extract information from a file using grep and make connections between related occurrences. For example, my file may contain the following repeated pattern:

Section
Info1
etc etc
Info2

I want to grep for Section and grab Info1 and Info2. I tried an OR pattern, i.e., Info1\|Info2, but this matches every Info1 and Info2 in the file with no link to the section they came from. I want the Info1 and Info2 of each section to be retrieved together.

All sections are the same length. There's always a fixed number of lines between Info1 and Info2. The desired output is:

Info1
Info2
Info1
Info2
...

where each consecutive Info1/Info2 pair comes from the same section. Any idea how to do this?

Answer by Todd A. Jacobs:

Line-Anchored Grep

You don't need alternation or pipes for the example you posted. Given your corpus, the following works just fine:

$ grep '^Info' /tmp/foo
Info1
Info2
Info1
Info2

Unless lines that start with Info can also appear outside your sections, you don't need anything more complicated. However, assuming that your real corpus is more complex, and that you might need to do additional processing within each section, I address fixed-length sections below.
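To make that caveat concrete, here is a minimal sketch that assumes a hypothetical sample file with a stray line beginning with Info between two sections. The temporary file and its contents are assumptions for illustration, not the asker's actual data.

```shell
# Hypothetical sample file: two sections with a stray "Information footer"
# line between them (contents are an assumption for illustration).
sample=$(mktemp)
printf '%s\n' Section Info1 'etc etc' Info2 'Information footer' \
              Section Info1 'etc etc' Info2 > "$sample"

# Anchored match: prints every line that begins with "Info", in file order.
anchored=$(grep '^Info' "$sample")
printf '%s\n' "$anchored"

rm -f "$sample"
```

With this input the anchored match also prints the stray "Information footer" line, which is the kind of case where limiting the search to each section becomes useful.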

Filtering Fixed-Length Sections with Grep

Assuming each section is exactly 4 lines, like:

Section
Info1
etc etc
Info2

Section
Info1
etc etc
Info2

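then you can use the -A flag to set how many lines of trailing context grep prints after each match; -A3 returns each Section line plus the three lines that follow it. The -F flag treats Section as a fixed string rather than a regular expression. You can then pipe that into an anchored expression that matches Info at the beginning of any line. This returns the results you wanted: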

$ grep -F -A3 Section /tmp/foo | grep '^Info'
Info1
Info2
Info1
Info2
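The pipeline above can be tried end to end with a small script. This is only a sketch under stated assumptions: the temporary file, its contents, and the stray "Information footer" line are hypothetical and stand in for a corpus where Info-prefixed lines also occur outside the four-line sections.

```shell
# Hypothetical sample file: two four-line sections with a stray line
# between them (contents are an assumption for illustration).
sample=$(mktemp)
printf '%s\n' Section Info1 'etc etc' Info2 'Information footer' \
              Section Info1 'etc etc' Info2 > "$sample"

# Step 1: keep each "Section" line plus the 3 lines after it.
# Step 2: keep only lines that begin with "Info".
# grep may emit a "--" separator between non-adjacent context groups;
# it does not begin with "Info", so the second grep drops it.
filtered=$(grep -F -A3 Section "$sample" | grep '^Info')
printf '%s\n' "$filtered"

rm -f "$sample"
```

Because the stray line falls outside the three lines of context after each Section match, it never reaches the second grep, and the Info1/Info2 lines come out paired per section.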