How does a C/C++ parser/lexer tell the '*' of a pointer from the '*' of multiplication?

How does a C/C++ tokeniser/parser avoid misinterpreting '*', given that it is used both for multiplication and for pointer types? For example:

... {
    ...
    obj *var1; // '*' declares var1 as a pointer to obj
    var1 * var2; // '*' multiplies var1 by var2
}

Update 1: While tokenising/parsing, we can't yet distinguish an identifier that refers to a variable from an identifier that refers to a type.

Update 2 (context of the question): I'm designing and implementing a programming language in the C/C++ family in which pointers are currently declared like Pointer<int>, and I want to use C-style pointer declarations instead.

Update 3 (Dec 30, 2016): Some answers to this Stack Overflow question about LR(1) parsers and C++ seem to address my question.

Best answer, by AudioBubble:

The tokeniser doesn't make a distinction between the two. It just treats it as the token *.

The parser knows how to look up names. It knows that obj is a type, so it can parse <type> * <identifier> differently from <non-type> * <non-type>. Your instinct is on to something: it's not possible to parse the syntax of C without implementing any of the semantics. A correct parse of C requires interpreting declarations and keeping track of which names name types and which name non-types.

Your update:

While tokenising/parsing, we can't yet distinguish an identifier that refers to a variable from an identifier that refers to a type.

is not quite right, since it assumes that tokenising/parsing is done all at once as a separate step. In fact, parsing and semantic analysis are interleaved. When typedef int obj; is parsed, it is interpreted and taken to mean obj now names a type. When parsing continues and obj * var1; is seen, the results of the earlier semantic analysis are available for use.
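
To make the interleaving concrete, here is a minimal sketch in C++. It is not how any real compiler is organised. The names tokenize, Parser, StmtKind and parse_statement are hypothetical, and the "grammar" only recognises the two statement shapes from the question plus a simple typedef. The point is that the tokeniser always produces the same '*' token, and the parser decides what it means by consulting a table of type names that earlier declarations have filled in.

```cpp
#include <cctype>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// All names here (StmtKind, tokenize, Parser) are illustrative assumptions,
// not part of any real compiler's API.
enum class StmtKind { TypedefDecl, PointerDecl, Multiplication, Unknown };

// Split a statement into identifiers and single-character punctuation.
// '*' is always the same token: the tokeniser never decides what it means.
std::vector<std::string> tokenize(const std::string& src) {
    std::vector<std::string> tokens;
    std::size_t i = 0;
    while (i < src.size()) {
        unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isspace(c)) {
            ++i;
        } else if (std::isalpha(c) || c == '_') {
            std::size_t start = i;
            while (i < src.size() &&
                   (std::isalnum(static_cast<unsigned char>(src[i])) ||
                    src[i] == '_')) {
                ++i;
            }
            tokens.push_back(src.substr(start, i - start));
        } else {
            tokens.push_back(std::string(1, src[i]));
            ++i;
        }
    }
    return tokens;
}

struct Parser {
    // Names currently known to be types. Parsing a typedef adds to this set,
    // which is the "semantic analysis" that later statements rely on.
    std::set<std::string> type_names{"int"};

    // Parse one statement and update the type table as a side effect.
    StmtKind parse_statement(const std::string& src) {
        std::vector<std::string> t = tokenize(src);
        // Shape: typedef <type> <name> ;
        if (t.size() == 4 && t[0] == "typedef" &&
            type_names.count(t[1]) != 0 && t[3] == ";") {
            type_names.insert(t[2]);
            return StmtKind::TypedefDecl;
        }
        // Shape: <name> * <name> ;
        // Same tokens either way; only the name lookup tells them apart.
        if (t.size() == 4 && t[1] == "*" && t[3] == ";") {
            return type_names.count(t[0]) != 0 ? StmtKind::PointerDecl
                                               : StmtKind::Multiplication;
        }
        return StmtKind::Unknown;
    }
};
```

With this sketch, feeding "typedef int obj;" to a Parser and then "obj * var1;" yields a pointer declaration, while "var1 * var2;" yields a multiplication because var1 was never recorded as a type name. Feeding "obj * var1;" before the typedef also yields a multiplication, which is exactly why the order of parsing and name lookup matters.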