If "Who acted as (?P<role>.*) in (?P<movie>.*)" is the template, I want to match queries like "Who acted as tony montana in Scarface".
If the role name or the movie name contains an "in", the regex match goes wrong.
E.g. "Who acted as k in men in black" gives "k in men" as the role.
Maybe a non-greedy approach will work for this query, but it will fail on other queries where the split belongs at a different "in". How do I get all possible interpretations here?
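To make the problem concrete, here is a small check of the greedy template against a non-greedy variant on the example query. The non-greedy pattern (`.*?` for the role) is an assumption about what "non-greedy approach" would look like; the variable names are only illustrative.

```python
import re

query = "Who acted as k in men in black"

# Greedy role group: backtracks to the LAST " in " in the query.
greedy = re.match(r"Who acted as (?P<role>.*) in (?P<movie>.*)", query)

# Non-greedy role group: stops at the FIRST " in " in the query.
lazy = re.match(r"Who acted as (?P<role>.*?) in (?P<movie>.*)", query)

greedy_result = (greedy.group("role"), greedy.group("movie"))
lazy_result = (lazy.group("role"), lazy.group("movie"))

print(greedy_result)  # ('k in men', 'black')
print(lazy_result)    # ('k', 'men in black')
```

Each pattern commits to exactly one split point, so neither can return both readings.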
Given a phrase like 'a in b in c in d', you can generate all possible partitions by the word "in", i.e. split the phrase at each occurrence of "in" in turn.

For your specific problem, if there are three "in"s in the phrase, the "middle" interpretation ((a in b) in (c in d)) is most probably correct. With two "in"s, however, there's no way to solve this by text manipulation alone, because the "left" and "right" partitions are equally probable: the question's "k in men in black" should split at the first "in", but another phrase could just as plausibly need the split at the second.

You'll have to use NLP or database-driven methods to parse this correctly.
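The partitioning by the word "in" described above could be sketched roughly as follows. This is a minimal illustration, not necessarily the code originally posted: `split_on_word` and `interpretations` are hypothetical helper names, and the sketch assumes the query always starts with the fixed prefix "Who acted as" and that words are separated by whitespace.

```python
import re


def split_on_word(phrase, word="in"):
    """Return every (left, right) split of `phrase` at a standalone `word`.

    Splits that would leave an empty left or right side are skipped.
    """
    tokens = phrase.split()
    return [
        (" ".join(tokens[:i]), " ".join(tokens[i + 1:]))
        for i, tok in enumerate(tokens)
        if tok.lower() == word and 0 < i < len(tokens) - 1
    ]


def interpretations(query):
    """Return all candidate (role, movie) pairs for a 'Who acted as ...' query."""
    m = re.match(r"Who acted as (?P<rest>.+)", query, flags=re.IGNORECASE)
    if m is None:
        return []
    return split_on_word(m.group("rest"))


print(split_on_word("a in b in c in d"))
print(interpretations("Who acted as k in men in black"))
```

For "Who acted as k in men in black" this yields both ("k", "men in black") and ("k in men", "black"); choosing between them is the part that still needs a movie database lookup or NLP.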