Regex Lookarounds, prevent matches before and after


I have a regex that I cannot get to work correctly. I am running it with PCRE in PHP.

The regex looks for inch measurements written as fractions, with a forward slash separating the numerator and denominator, e.g. 1 3/8in or 19 15/16"

It would match 12 1/2" here:

A product description with 12 1/2" in it.

But I want it NOT to match if the measurement is part of a dimension, i.e. it has an x before or after it and follows this format: 19 3/4" x 19 5/8"

Example text that is matching incorrectly:

Product description with 19 3/4" x 19 5/8" in it.

This matches 5/8" when it is supposed to ignore all of it because of the x in there.

My regex currently rejects the measurement to the left of the x, but on the right side it only skips the whole number: it still matches 5/8" from the example above. I need it to ignore both sides of the dimension and only match measurements that stand by themselves. I am using a negative lookbehind and a negative lookahead to detect the x.

Regex:

/\s+(?<!x\s)\d*\s?\d+\/\d+"*\s*(in|")(?!\d*\s?x)\s*/i

I ran it through regex101.com's debugger and still can't figure it out.
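
For reference, the behaviour described above can be reproduced with a short script. This is only a sketch: the variable names are illustrative, and the sample text is the one from the question. A likely explanation, as far as I can tell, is that the leading \s+ is free to match the space between 19 and 5/8", so the lookbehind inspects the characters "9 " rather than "x " and does not reject the match.

```php
<?php
// The pattern from the question, unchanged.
$pattern = '/\s+(?<!x\s)\d*\s?\d+\/\d+"*\s*(in|")(?!\d*\s?x)\s*/i';
$text = 'Product description with 19 3/4" x 19 5/8" in it.';

// preg_match returns 1 when the pattern matches; $match[0] is the full match.
$found = preg_match($pattern, $text, $match);

// For this input the full match should contain 5/8", i.e. the fraction
// on the right-hand side of the x that was meant to be ignored.
var_dump($found, $match[0] ?? null);
```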


There are 1 answers

Lucas Trzesniewski On

You can use the (*SKIP)(*FAIL) trick:

(?(DEFINE)(?<measure>
  (?:\d+ \s*)? \d+ / \d+ (?:in|")
))

(?&measure) \s* x \s* (?&measure) (*SKIP)(*FAIL)
| (?&measure)


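The first part defines what a measure is (you can think of it as a function). The pattern is written for extended mode (the x flag), so the whitespace inside it is ignored. Then, if we find two measures separated by x ((?&measure) \s* x \s* (?&measure)), (*SKIP) marks that position so the engine will not retry from anywhere inside the text consumed so far, and (*FAIL) forces the match to fail. The search then resumes after the whole dimension.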

The other part of the alternative can then match the single measurements you're interested in.
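
A minimal sketch of how the pattern above might be used from PHP. The function name findStandaloneMeasures is hypothetical, the ~ delimiters are chosen here only so the / needs no escaping, and the i flag is an assumption carried over from the question's regex. The x flag is required because the pattern relies on ignored whitespace.

```php
<?php
// Hypothetical helper (not part of the answer): returns the measurements
// that stand alone, skipping any pair joined by "x".
function findStandaloneMeasures(string $text): array
{
    // x = extended mode (whitespace in the pattern is ignored),
    // i = case-insensitive (assumption, taken from the question's regex).
    $pattern = <<<'REGEX'
~
(?(DEFINE)(?<measure>
  (?:\d+ \s*)? \d+ / \d+ (?:in|")
))

(?&measure) \s* x \s* (?&measure) (*SKIP)(*FAIL)
| (?&measure)
~xi
REGEX;

    preg_match_all($pattern, $text, $matches);
    return $matches[0]; // index 0 holds the full matches
}

// Contains only 12 1/2"
var_export(findStandaloneMeasures('A product description with 12 1/2" in it.'));
// Empty array: both sides of the dimension are skipped
var_export(findStandaloneMeasures('Product description with 19 3/4" x 19 5/8" in it.'));
```

Note that, unlike the question's regex, the measure definition does not allow whitespace between the fraction and the unit, so 1 3/8 in would need an extra \s* before (?:in|").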

The second part could also be written as:

(?&measure) (?: \s* x \s* (?&measure) (*SKIP)(*FAIL) )?
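
To check that the compact form behaves like the two-branch form, both can be run over the same inputs. The sketch below assumes the same DEFINE block, the x and i flags, and ~ delimiters; the helper name matchesOf and the sample strings beyond those in the question are illustrative only.

```php
<?php
// Shared subroutine definition, copied from the answer (extended mode).
$define = '(?(DEFINE)(?<measure> (?:\d+ \s*)? \d+ / \d+ (?:in|") ))';

// Two-branch form and compact form, built on the same definition.
$twoBranch = '~' . $define . ' (?&measure) \s* x \s* (?&measure) (*SKIP)(*FAIL) | (?&measure) ~xi';
$compact   = '~' . $define . ' (?&measure) (?: \s* x \s* (?&measure) (*SKIP)(*FAIL) )? ~xi';

// Hypothetical helper: returns all full matches of $pattern in $text.
function matchesOf(string $pattern, string $text): array
{
    preg_match_all($pattern, $text, $m);
    return $m[0];
}

$samples = [
    'A product description with 12 1/2" in it.',
    'Product description with 19 3/4" x 19 5/8" in it.',
    'Shelf 12 1/2" deep, panel 19 3/4" x 19 5/8" wide, lip 1 3/8in.',
];

foreach ($samples as $text) {
    $a = matchesOf($twoBranch, $text);
    $b = matchesOf($compact, $text);
    // Report whether the two forms agree on this input.
    echo ($a === $b ? 'same' : 'different'), ': ', json_encode($a), PHP_EOL;
}
```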