To match a string starting with "dog", followed by "cat" (but not consuming "cat"), this works:
local lpeg = require 'lpeg'
local str1 = 'dogcat'
local patt1 = lpeg.C(lpeg.P('dog')) * #lpeg.P('cat')
print(lpeg.match(patt1, str1))
Output: dog
To match a string starting with "dog", followed by any character sequence, then followed by "cat" (but not consuming it), like the regex lookahead (dog.+?)(?=cat), I tried this:
local str2 = 'dog and cat'
local patt2 = lpeg.C(lpeg.P("dog") * lpeg.P(1) ^ 1) * #lpeg.P("cat")
print(lpeg.match(patt2, str2))
My expected result is "dog and", but it returns nil.
If I throw away the lookahead part (i.e., use the pattern lpeg.C(lpeg.P("dog") * lpeg.P(1) ^ 1)), it matches the whole string successfully. This means the lpeg.P(1) ^ 1 part matches any character sequence correctly, doesn't it?
How can I fix it?
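The repetition lpeg.P(1) ^ 1 is greedy and never backtracks, so it also consumes "cat" and leaves nothing for the lookahead. You need to exclude "cat" at each position the repetition can match: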
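A minimal sketch of that fix, assuming the same subject string as in the question. The names `subject`, `greedy`, and `fixed` are only illustrative. The difference operator `patt1 - patt2` matches `patt1` only where `patt2` does not match, so `lpeg.P(1) - lpeg.P('cat')` consumes a character only if "cat" does not start there:

```lua
local lpeg = require 'lpeg'
local P, C = lpeg.P, lpeg.C

local subject = 'dog and cat'

-- Pattern from the question: P(1)^1 runs to the end of the subject,
-- so the lookahead #P('cat') is tried at end of input and fails.
local greedy = C(P('dog') * P(1)^1) * #P('cat')

-- Adjusted pattern: (P(1) - P('cat')) matches one character only where
-- 'cat' does not start, so the repetition stops right before 'cat'.
local fixed = C(P('dog') * (P(1) - P('cat'))^1) * #P('cat')

print(lpeg.match(greedy, subject))  -- nil
print(lpeg.match(fixed, subject))   -- dog and  (capture ends with a space)
```

Note that the capture is "dog and " with a trailing space, which is also what the regex (dog.+?)(?=cat) would capture. Trim it afterwards if the space is not wanted.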
I think it's appropriate to plug the debugger I've been working on (pegdebug), as it helps in cases like this. Here is the output it generates for the original lpeg-expression:
You can see that the Separator expression "eats" all the characters, including "cat", and there is nothing left to match against P"cat".
The output for the modified expression looks like this:
Here is the full script: