I am currently working on a compiler and recently ran into an issue with parsing operators in an expression. I have not seen this be an issue in other languages, which makes me think there is most likely a well-known solution that I have yet to think of.
In an expression like a = 32*-1, the * and - are separate operators, but the developer chose to write them with no whitespace in between. One possibility is to require whitespace between operators. However, I would rather not.
My current setup makes it easy to add new operators, although for now it also makes operator lookup more expensive. I would like to keep operators free of any rules: no operator should have to be designed around what the parser is able to split. For example, I could add a rule such as "if a run of operator characters starts with a parenthesis, split it there" to handle an expression like a = (*ptr).member_value. I do not want such a rule, because it would prevent any operator from starting with a parenthesis.
I have thought of finding the longest valid operator at the start of a run of operator characters and splitting there. I am wondering whether this is a valid solution and whether it may cause problems. Pointers to any discussion of this topic or to standard algorithms would also be deeply appreciated.
As per sepp2k's recommendation, I tried using "maximal munch" aka "longest match" to solve this and it works great for now.
More information on the topic is available on the short Wikipedia page on maximal munch.
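For anyone landing here later, this is a minimal sketch of the longest-match split in Python. It is not my actual compiler code: the OPERATORS set and the split_operators name are made up for illustration, and a real lexer would apply the same idea while scanning the whole source rather than a pre-extracted run of operator characters.

```python
# Hypothetical operator table for illustration; a real compiler would
# register its own operators here.
OPERATORS = {"=", "*", "-", "(", ")", ".", "->", "==", "*="}


def split_operators(run, operators=OPERATORS):
    """Split a run of operator characters using longest match (maximal munch)."""
    longest = max(len(op) for op in operators)
    tokens = []
    pos = 0
    while pos < len(run):
        # Try the longest possible candidate first, then shrink until one matches.
        for length in range(min(longest, len(run) - pos), 0, -1):
            candidate = run[pos:pos + length]
            if candidate in operators:
                tokens.append(candidate)
                pos += length
                break
        else:
            # No registered operator is a prefix of the remaining text.
            raise ValueError(f"no operator matches at {run[pos:]!r}")
    return tokens


print(split_operators("*-"))   # the operators from a = 32*-1
print(split_operators("(*"))   # the operators from a = (*ptr).member_value
print(split_operators("*=-"))  # prefers "*=" over "*"
```

One caveat with this approach: if a combined operator such as *- were ever registered, 32*-1 would be read as that single operator instead of * followed by -, so the developer would need whitespace to get the two-operator meaning in that case.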