I am working on a project using pycparser for parsing C source code.
According to https://gcc.gnu.org/onlinedocs/cpp/Preprocessor-Output.html, when I run the preprocessor I get linemarkers of the form # linenum filename flags in my preprocessed translation unit.
However, when I parse the output of gcc -E using pycparser, the tokens embed the coordinates (file, line, column), but they do not seem to include any information from the linemarkers' flags, which would be very useful to me.
Is there any solution or advice for also including the linemarkers in my AST, or for embedding their information in the tokens of the AST?
UPDATE: What I need is to walk through my tokens and determine which file they belong to (this is already in pycparser), but also how that file was included.
This information is in the flags field of the linemarkers introduced by the preprocessor. Indeed, if I have:
<tokens of file1.h>
<tokens of file2.h>
<tokens of main.c>
the inclusion of file2.h could have been in either file1.h or main.c. I need to extract this information using pycparser. I know I can use gcc -H and do a lot of analysis and post-processing to work it out. However, the flags field of the linemarkers reports whether I am entering a file or returning to one, so it already encodes the nested inclusion. Is this information available somewhere in pycparser? Can it be added easily?
pycparser understands #line directives and incorporates them into the coordinates it tracks for all tokens. For example, consider this file:
We can dump its AST:
Note that the declaration of b has the location /tmp/file.c:3:5, meaning line 3 (and column 5) of the file.
Now modify the file slightly to be:
And dump the AST again:
See what happened to the location of b?
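The coordinates carry the file and line taken from each directive, but as far as I can tell not the trailing flags. If the include nesting is what you need, one option is a separate pass over the gcc -E output before (or alongside) parsing. The following is only a minimal sketch using the standard library; the helper name include_parents and the sample text are made up for illustration, and it assumes the linemarker format and flag meanings described in the GCC page linked in the question (1 = entering a new file, 2 = returning to a file).

```python
import re

# GCC linemarker:  # linenum "filename" flags
# Flags per the GCC docs: 1 = entering a new file, 2 = returning to a file,
# 3 = system header, 4 = wrap in extern "C".
LINEMARKER = re.compile(r'^#\s+(\d+)\s+"((?:[^"\\]|\\.)*)"((?:\s+\d+)*)\s*$')


def include_parents(preprocessed):
    """Map each included file to the file that included it.

    Hypothetical helper, not part of pycparser.
    """
    stack = []    # files currently open, innermost last
    parents = {}  # included file -> including file
    for line in preprocessed.splitlines():
        m = LINEMARKER.match(line)
        if not m:
            continue
        filename = m.group(2)
        flags = set(m.group(3).split())
        if "1" in flags:
            # Entering a new file: whatever is on top of the stack included it.
            if stack:
                parents.setdefault(filename, stack[-1])
            stack.append(filename)
        elif "2" in flags:
            # Returning to a file: unwind until that file is on top again.
            while stack and stack[-1] != filename:
                stack.pop()
            if not stack:
                stack.append(filename)
        elif stack:
            # No enter/return flag: the current file name simply changes.
            stack[-1] = filename
        else:
            stack.append(filename)
    return parents


# Made-up preprocessor output in which main.c includes file1.h,
# which in turn includes file2.h.
sample = '''# 1 "main.c"
# 1 "file1.h" 1
# 1 "file2.h" 1
int b;
# 2 "file1.h" 2
int a;
# 2 "main.c" 2
int main(void) { return 0; }
'''

print(include_parents(sample))
```

The resulting dictionary can then be looked up with the file name found in each node's coordinates, assuming the names match the ones in the linemarkers. A file included from several places keeps only its first includer here; recording every (parent, line) pair instead would be a small change.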