I've got dynamic(sometime changes) array of regular expressions, for example URL path structure:
/(home)?$
-> home view/news/(index)?$
->/news/([a-zA-Z0-9_]+)/?$
-> news article view( arg_1 )
/news/([a-zA-Z0-9_]+)/reply_to_comment\?(.*)
-> news comment reply view( arg 1, arg 2 )
/(.+)
-> 404 view( arg 1 )
If there are collisions, the first expression is winner. For example last expression matches everything, but in case, that no expression before matched it. Or /news/index can by matched as article, but index is before article expression, so it wins.
I'd like to build state machine/expression tree, which I can use to match any strings in O(n) time, where n is length of string is being matched, i.e. I don't care about time needed to build that tree, but I want to have same speed of matching regardless of number of expressions.
Or at least for "limited" regex, supporting only: +
, *
, ?
, []
, [^ ]
, ()
, $
.
Treat every expression expr
not starting with ^
as if it were written as ^expr
.