I am trying to write a lexical analyzer for the C# language, but I can't figure out how to differentiate the plus sign from the plus operator other than by context. My scanner has to return the next token from the source file. So when I encounter a +, how do I know whether it is the sign of some integer, real, or other number, or the + operator? How can my scanning function tell these two situations apart? The case is similar to <, <= and <<, but in my situation the next character doesn't always help.
int a = +1;
a=2 + 3;
OK, but you have misplaced the boundary between your lexer and your parser here.
The lexer's job is to "cut" the input string into tokens. The parser's job is to interpret these. Your lexer should just detect the + operator, emit the corresponding token, and that's it. Then your parser, which has context knowledge (i.e. it knows which part of an expression it is trying to parse at a given moment), is in a much better position to tell a unary operator from a binary one. The lexer simply lacks the necessary information.
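To illustrate how the parser's position in the grammar settles the question, here is a minimal recursive-descent sketch in Python. It is not tied to any real C# grammar: the function name parse_expression, the tuple-shaped tree, and the restriction to integer operands with + and - are all assumptions made for the example. The input is a list of tokens that have already been cut by a lexer.

```python
def parse_expression(tokens):
    """Parse a flat list of token strings into a nested tuple tree.

    Hypothetical mini-grammar, for illustration only:
        additive := unary (('+' | '-') unary)*
        unary    := ('+' | '-') unary | NUMBER
    """
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def parse_unary():
        # We are at the START of an operand, so + or - must be a sign.
        if peek() in ("+", "-"):
            op = advance()
            return ("unary", op, parse_unary())
        tok = peek()
        if tok is None or not tok.isdigit():
            raise SyntaxError(f"expected a number, got {tok!r}")
        return ("num", int(advance()))

    def parse_additive():
        left = parse_unary()
        # We have just FINISHED an operand, so + or - must be binary.
        while peek() in ("+", "-"):
            op = advance()
            left = ("binary", op, left, parse_unary())
        return left

    tree = parse_additive()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree
```

The same + token ends up as a unary node in parse_expression(["+", "1"]) and as a binary node in parse_expression(["2", "+", "3"]), purely because of which parsing function consumed it.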
Likewise, you shouldn't include the - sign in number tokens either. Here are some lexing examples:
int a=+1;  -->  int  a  =  +  1  ;
a=2+3;  -->  a  =  2  +  3  ;
Note the separate + and 1 tokens in the first case. Your lexer shouldn't emit a single +1 token.
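A lexer that behaves this way can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the token set is a tiny subset chosen for the example (it is not the C# lexical grammar), keywords such as int are treated as plain identifiers, and the names TOKEN_RE and tokenize are made up here. Multi-character operators are listed before their prefixes so that <<= wins over << and <, which covers the "look at the next character" case from the question.

```python
import re

# Hypothetical token set for illustration; number tokens never carry a sign.
TOKEN_RE = re.compile(r"""
    (?P<number>\d+(?:\.\d+)?)          # unsigned integer or real literal
  | (?P<ident>[A-Za-z_]\w*)            # identifiers (keywords included)
  | (?P<op><<=|<<|<=|[-+*/=<>;(),])    # longest operators listed first
  | (?P<space>\s+)                     # whitespace, skipped below
""", re.VERBOSE)

def tokenize(source):
    """Cut the source string into a flat list of token strings."""
    tokens = []
    pos = 0
    while pos < len(source):
        m = TOKEN_RE.match(source, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at {pos}")
        if m.lastgroup != "space":
            tokens.append(m.group())
        pos = m.end()
    return tokens
```

Both tokenize("int a=+1;") and tokenize("a=2+3;") return + as its own token; whether it is a sign or an addition is left entirely to the parser.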