How does a C/C++ tokeniser/parser avoid misinterpreting '*', given that it can denote both multiplication and a pointer type? E.g.:
... {
...
obj *var1; // * declares var1 as a pointer to obj
var1 * var2; // * multiplies var1 by var2
}
Update 1: While tokenising/parsing, we can't yet distinguish an identifier that refers to a variable from one that refers to a type.
Update 2: (Context of the question) I'm designing and implementing a programming language of the C/C++ family, where pointers are declared like Pointer<int>, and I want to use C-style pointer syntax instead.
Update 3 (on Dec 30, 2016): Some answers to this Stack Overflow question about LR(1) parsers and C++ seem to address my question.
The tokeniser doesn't make a distinction between the two. It just treats it as the token *.
The parser knows how to look up names. It knows that obj is a type, so it can parse <type> * <identifier> differently from <non-type> * <non-type>.
Your instinct is on to something: it's not possible to parse the syntax of C without implementing any of the semantics. A correct parse of C requires interpreting declarations and keeping track of which names name types and which name non-types.
Your first update is not quite right, since it assumes that tokenising/parsing is done all at once as a separate step. In fact, parsing and semantic analysis are interleaved. When typedef int obj; is parsed, it is interpreted immediately and taken to mean that obj now names a type. When parsing continues and obj * var1; is seen, the results of the earlier semantic analysis are available for use.
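To make the interleaving concrete, here is a minimal sketch in C of a parser that consults a table of known type names while it parses. It is only an illustration, not how any particular compiler is written: the names type_names, declare_type, is_type_name, StmtKind and parse_statement are all made up for this example, and the "grammar" covers just the two statement shapes discussed above.

```c
#include <stdio.h>
#include <string.h>

#define MAX_TYPES 32
#define MAX_NAME  32

/* Hypothetical symbol table: names the parser currently knows to be types. */
static char type_names[MAX_TYPES][MAX_NAME] = { "int", "char" };
static int type_count = 2;

static int is_type_name(const char *name) {
    for (int i = 0; i < type_count; i++)
        if (strcmp(type_names[i], name) == 0)
            return 1;
    return 0;
}

static void declare_type(const char *name) {
    if (type_count < MAX_TYPES)
        snprintf(type_names[type_count++], MAX_NAME, "%s", name);
}

typedef enum {
    STMT_TYPEDEF,       /* typedef <type> <name>;           */
    STMT_POINTER_DECL,  /* <type> * <identifier>;           */
    STMT_MULTIPLY,      /* <non-type> * <non-type>;         */
    STMT_UNKNOWN        /* anything this sketch can't parse */
} StmtKind;

/* Parse one statement. The '*' token is the same in both cases; only the
 * lookup of the left-hand name in the symbol table decides the meaning. */
StmtKind parse_statement(const char *src) {
    char a[MAX_NAME], b[MAX_NAME];

    /* Semantic action runs during parsing: a typedef updates the table
     * right away, so later statements see the new type name. */
    if (sscanf(src, " typedef %31[A-Za-z0-9_] %31[A-Za-z0-9_]", a, b) == 2) {
        if (!is_type_name(a))
            return STMT_UNKNOWN;
        declare_type(b);
        return STMT_TYPEDEF;
    }

    if (sscanf(src, " %31[A-Za-z0-9_] * %31[A-Za-z0-9_]", a, b) == 2)
        return is_type_name(a) ? STMT_POINTER_DECL : STMT_MULTIPLY;

    return STMT_UNKNOWN;
}
```

Calling parse_statement("obj * var1;") before any typedef is seen classifies the statement as a multiplication. After parse_statement("typedef int obj;"), the very same text is classified as a pointer declaration. A real compiler also has to handle scopes (an inner declaration can hide a type name), which this sketch ignores.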