How can I write a regular expression to recognize the plus operator and the plus sign?


I am trying to write a lexical analyzer for the C# language, but I can't figure out how to differentiate the plus sign from the plus operator other than by context. I need the next token from the source file. So, when I encounter a +, how do I know whether it is the sign of a number (integer, real, whatever) or the + operator? How can my scanning function distinguish these two situations? The case is similar to < versus <= and <<, but in my situation the next character doesn't always help.

int a = +1;
a=2 + 3;

There is 1 answer

Lucas Trzesniewski (best answer)

I am trying to write a lexical analyzer for C# language

OK, but you have drawn the line between your lexer and your parser in the wrong place.

The lexer's job is to "cut" the input string into tokens. The parser's job is to interpret these. Your lexer should just detect the + operator, emit the corresponding token, and that's it.
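As a rough illustration of that division of labour, here is a minimal regex-based tokenizer sketch. It is written in Python only for brevity; the token class names (NUMBER, IDENT, OP, PUNCT) and the tiny operator set are assumptions made for this example, not the real C# lexical grammar. The point to notice is that the NUMBER pattern has no leading sign, so + always comes out as its own token.

```python
import re

# Hypothetical token classes for illustration; a real C# lexer has many more.
TOKEN_SPEC = [
    ("NUMBER", r"\d+(?:\.\d+)?"),    # unsigned: no leading + or -
    ("IDENT",  r"[A-Za-z_]\w*"),     # keywords such as int are not separated here
    ("OP",     r"<<|<=|[-+*/=<>]"),  # longer operators listed before shorter ones
    ("PUNCT",  r"[;(),{}]"),
    ("SKIP",   r"\s+"),              # whitespace is matched and then dropped
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def tokenize(source):
    """Cut source into (kind, text) pairs without interpreting them."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at {pos}")
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens

print(tokenize("int a=+1;"))
print(tokenize("a=2+3;"))
```

The same "longest alternative first" ordering is what handles <, <= and << from the question: the next character decides which operator token is emitted, but never what the operator means.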

Then your parser, which has context knowledge (i.e. it knows which part of an expression it is trying to parse at a given moment), is in a much better position to distinguish between a unary and a binary operator. The lexer simply lacks the necessary information.
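To make the "context knowledge" concrete, here is a small recursive-descent sketch, again in Python and again only an illustration. It assumes the input is already a list of token strings, and the grammar (additive, unary, primary) and the tuple node shapes are invented for this example rather than taken from the C# specification. A + seen where an operand is expected is treated as unary; a + seen after an operand is treated as binary.

```python
def parse_expression(tokens):
    """Parse a list of token strings into a nested tuple tree (illustrative only)."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        return token

    def primary():
        token = peek()
        if token is None:
            raise SyntaxError("unexpected end of input")
        advance()
        if token == "(":
            node = additive()
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
            return node
        if token[0].isalnum() or token[0] == "_":
            return token  # number or identifier leaf
        raise SyntaxError(f"unexpected token {token!r}")

    def unary():
        # We are at the start of an operand, so + or - here is a sign.
        if peek() in ("+", "-"):
            op = advance()
            return ("unary" + op, unary())
        return primary()

    def additive():
        left = unary()
        # An operand has just been parsed, so + or - here is binary.
        while peek() in ("+", "-"):
            op = advance()
            left = ("binary" + op, left, unary())
        return left

    tree = additive()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree

print(parse_expression(["+", "1"]))       # ('unary+', '1')
print(parse_expression(["2", "+", "3"]))  # ('binary+', '2', '3')
```

The token "+" is identical in both calls; only the parser's position in the grammar differs, which is exactly the information a lexer does not have.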

For the same reason, you shouldn't include the - sign in number tokens either.

Here are some lexing examples:

int a=+1; --> int a = + 1 ;

a=2+3; --> a = 2 + 3 ;

Note the + 1 in the first case. Your lexer shouldn't emit +1.