How to match (group2.*|^.*)group1 when no instance of groups 1, 2, 3, or 4 is in between?


I'm using Python 3.4.

Suppose we have four groups composed of regular expressions

g1 = 'g11|g12|...|g1m'
g2 = 'g21|g22|...|g2n'
g3 = 'g31|g32|...|g3p'
g4 = 'g41|g42|...|g4q'

For example, g1 might be 'chickens|horses|(?<!blue )bonnet'. The groups are disjoint: no element in any of the four groups belongs to more than one group. The groups can have any number of elements greater than 1.

I want to match on a string if and only if it contains any instance of group_1 such that either:

  1. no instances of any of groups 1-4 precede said instance of group_1 or
  2. the instance of any of groups 1-4 that immediately precedes said instance of group_1 is not group_2.
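
One way to read the two rules is as a left-to-right scan over group occurrences: the string qualifies as soon as some group_1 occurrence is either the first occurrence of any group or directly follows an occurrence that is not from group_2. Below is a minimal sketch of that reading, not a definitive solution. It assumes the placeholder alternatives (g11, g12, g21, ...) stand in for the real patterns, and `GROUPS`, `TOKEN` and `qualifies` are hypothetical names chosen for illustration.

```python
import re

# Placeholder alternatives standing in for the real groups (assumption).
GROUPS = {
    "g1": "g11|g12",
    "g2": "g21|g22",
    "g3": "g31|g32",
    "g4": "g41|g42",
}

# One alternation with a named group per original group, so that each
# match reports which group it came from via m.lastgroup.
TOKEN = re.compile(
    "|".join("(?P<%s>%s)" % (name, pat) for name, pat in sorted(GROUPS.items()))
)


def qualifies(text):
    """True if some group_1 occurrence is not directly preceded by a group_2 occurrence."""
    previous = None  # name of the most recent group occurrence, if any
    for m in TOKEN.finditer(text):
        if m.lastgroup == "g1" and previous != "g2":
            return True  # rule 1 (previous is None) or rule 2 (previous is g1, g3 or g4)
        previous = m.lastgroup
    return False


print(qualifies("g11 g21 g11"))
print(qualifies("g31 g21 g11"))
```

This trades a single regex for a short loop, which keeps the "immediately preceding occurrence" logic explicit instead of encoding it in lookaheads.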

Some strings on which I want to match:

  1. 'g11'
  2. 'g31 g11'
  3. 'g41g11'
  4. 'g11 g21 g11' (The second instance of g11 violates rule 2, but the first instance satisfies rule 1, so the string matches.)
  5. 'anything or nothing g11 anything or nothing'
  6. 'anything or nothing g31 anything or nothing g11'

Some strings on which I don't want to match:

  1. 'g31 g21 g11'
  2. 'g21 g11 g31'
  3. 'anything or nothing g21 anything or nothing g11 anything or nothing'

What I've tried so far:

  • I tried: (g31|g32)(?=.*?(g11|g12))(?!.*?(g21|g22)), which works for 'g31 g11' and 'g31 g21 g11' but fails if there is a g21 or g22 after g11, as in 'g31 g11 g21'.

  • I've also tried '(g31|g32).*?(g21|g22){0}.*?(g11|g12)' which works for 'g31 g11' and 'g31 g21 g11' but not 'g31 g31 g21 g11'.
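
A note on the second attempt: in Python's `re`, the `{0}` quantifier means "exactly zero repetitions", so `(g21|g22){0}` matches the empty string and does not forbid those tokens; the surrounding `.*?` is still free to consume them. The small check below illustrates this with the placeholder tokens taken literally. The names `with_zero`, `without` and `samples` are only for illustration.

```python
import re

# The second attempt, and the same pattern with the {0} part removed.
with_zero = re.compile(r"(g31|g32).*?(g21|g22){0}.*?(g11|g12)")
without = re.compile(r"(g31|g32).*?.*?(g11|g12)")

samples = ["g31 g11", "g31 g21 g11", "g31 g31 g21 g11"]

for s in samples:
    # Compare whether each pattern finds a match in the same string.
    print(s, bool(with_zero.search(s)), bool(without.search(s)))

# On its own, the {0} group matches only the empty string.
print(repr(re.search(r"(g21|g22){0}", "g21").group(0)))
```

If the two columns agree for every sample, the `{0}` part is not excluding anything, which suggests a negative lookahead (or an explicit scan) is needed instead.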

There is 1 answer

Answer by vks (best answer):
^(?!(?:(?!g1|g2).)*(?:g21|g22)(?:(?!g31|g32|g41|g42).)*(?:g11|g12)).*?(?:g11|g12).*$

You can try this. See the demo:

https://regex101.com/r/hI0qP0/16
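
For anyone who wants to check the pattern locally rather than on regex101, here is a rough sketch in Python. It takes the question's placeholder tokens literally, so `g1` and `g2` in the first lookahead act as prefixes of `g11`, `g21` and so on; with real groups they would presumably be replaced by the full alternations. The names `PATTERN`, `should_match`, `should_not_match` and `edge` are only for illustration.

```python
import re

# The answer's pattern, with the question's placeholder tokens taken literally.
PATTERN = re.compile(
    r"^(?!(?:(?!g1|g2).)*(?:g21|g22)(?:(?!g31|g32|g41|g42).)*(?:g11|g12))"
    r".*?(?:g11|g12).*$"
)

should_match = ["g11", "g31 g11", "g41g11", "g11 g21 g11"]
should_not_match = ["g31 g21 g11", "g21 g11 g31"]

for s in should_match:
    assert PATTERN.search(s) is not None, s
for s in should_not_match:
    assert PATTERN.search(s) is None, s

# A case not listed in the question: the second g11 directly follows g31.
edge = "g21 g11 g31 g11"
print(edge, bool(PATTERN.search(edge)))
```

The listed examples behave as the question asks. The extra case prints False, because the negative lookahead is anchored at the start of the string and rejects it as soon as the first g2 occurrence is followed by a g1 occurrence; under rule 2 as stated, the second g11 would arguably qualify, so it may be worth testing inputs of that shape before relying on the pattern.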