ANTLR4 for Function Pointers in C


I am using ANTLR4 to parse C header files (.h), and for now I want to extract function signatures (function name, return type, and parameters) as well as function pointers.
In the future, I will try to expand this to structs, enums, typedefs, etc. I am doing this so that I can check compatibility between C header (.h) files.
The problem I am facing right now is that if there are two declarations in the .h file like this:
int *fp(int,int);
int *(*function2())(int, int);
My code treats these two declarations as the same, even though one is a function pointer and the other is a function returning a pointer that is used as a function pointer. I want to store fp in one dictionary and function2() in another.
The code I am using right now is this:

from CListener import CListener
from CParser import CParser


class FileStructure(CListener):
    def __init__(self):
        self.function_signatures = {}
        self.function_pointer = {}

    def print_definitions(self):
        print("Function Signatures :", self.function_signatures)
        print("Function Pointers :", self.function_pointer)

    def extractIdentifier(self, directDeclarator):
        """
        This function extracts the identifier name from a directDeclarator context.
        If the directDeclarator context contains an Identifier, it returns its text.
        If the context is wrapped in parentheses, it recursively looks for the Identifier.
        """
        # Base case: If the directDeclarator contains an Identifier, return its text
        if directDeclarator.directDeclarator():
            return directDeclarator.directDeclarator().getText()
        
        # Recursive case: If the directDeclarator is wrapped in parentheses, recurse into it
        nestedDeclarator = directDeclarator.declarator()
        if nestedDeclarator and nestedDeclarator.directDeclarator():
            return self.extractIdentifier(nestedDeclarator.directDeclarator())
        
        # If no identifier is found, return an empty string
        return ""

    def getParameters(self, parameterTypeListCtx):
        """
        This function extracts parameters and their types from a parameterTypeList context.
        It iterates over each parameterDeclaration context within the parameterTypeList.
        Each parameter's type specifier and name are extracted and added to a list.
        """
        parameters = []
        parameterListCtx = parameterTypeListCtx.parameterList()
        if parameterListCtx:
            for parameterDeclarationCtx in parameterListCtx.parameterDeclaration():
                # Extract the type specifier(s) for the parameter
                type_specifiers = [
                    token.getText() for token in parameterDeclarationCtx.declarationSpecifiers().children
                    if not isinstance(token, CParser.TypeQualifierContext)
                ]
                param_type = ' '.join(type_specifiers)

                # Extract the parameter name, if it exists
                param_name = ''
                if parameterDeclarationCtx.declarator():
                    param_name = parameterDeclarationCtx.declarator().directDeclarator().getText()

                parameters.append((param_type, param_name))
        return parameters

    def enterDeclaration(self, ctx: CParser.DeclarationContext):
        if ctx.initDeclaratorList():
            for initDeclaration in ctx.initDeclaratorList().initDeclarator():
                declarator = initDeclaration.declarator()
                directDeclarator = declarator.directDeclarator()

                # Handle functions returning a pointer (to data, function, or array)
                if declarator.pointer():
                    # Function returning a pointer to a function
                    if directDeclarator and directDeclarator.parameterTypeList():
                        function_name = self.extractIdentifier(directDeclarator)
                        print("function" , function_name)
                        self.function_pointer[function_name] = True
                    else:
                        # Function returning a pointer (to data or array)
                        print("else statement")
                        if directDeclarator and hasattr(directDeclarator, 'directDeclarator'):
                            # Check for an array type after the pointer
                            if any(hasattr(child, 'typeQualifierList') or hasattr(child, 'assignmentExpression') for child in directDeclarator.children):
                                # Function returning a pointer to an array
                                function_name = self.extractIdentifier(directDeclarator)
                                self.function_pointer[function_name] = True
                            else:
                                # Function returning a pointer to data (not a function or array)
                                function_name = self.extractIdentifier(directDeclarator)
                                print(function_name)
                                self.function_pointer[function_name] = True
                else:
                    # Handle normal function definitions
                    if directDeclarator and directDeclarator.parameterTypeList():
                        function_name = directDeclarator.directDeclarator().getText()
                        print(function_name)
                        return_type = ctx.declarationSpecifiers().getText()
                        parameter_list = self.getParameters(directDeclarator.parameterTypeList())
                        self.function_signatures[function_name] = (return_type, parameter_list)

However, the line

                        # Function returning a pointer (to data or array)
                        print("else statement")

is never executed, and the pointers are treated as functions. I am using the grammar that is provided here.
Functions like void function5(int arr[5]); are treated as normal functions, but functions like int (*function4())[5]; are not recognized at all.
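For reference, what each of these declarators means can be worked out without ANTLR by reading it inside-out from the identifier: suffixes such as (...) and [...] bind tighter than a leading *. The following is a minimal sketch of that rule in plain Python, assuming only the bare declarator text (no type specifiers, no qualifiers such as const). The names describe_declarator and classify are hypothetical helpers introduced for illustration; they are not part of ANTLR or the generated parser.

```python
import re

# Tokens of a bare declarator: identifiers, numbers, and punctuation.
_TOKEN = re.compile(r"[A-Za-z_]\w*|\d+|[()\[\]*,]")


def describe_declarator(text):
    """Return (name, derivations); derivations are listed from the name outward."""
    tokens = _TOKEN.findall(text)
    pos = 0

    def skip_balanced(open_tok, close_tok):
        # Skip a parenthesised or bracketed group, including nested groups.
        nonlocal pos
        depth = 0
        while True:
            tok = tokens[pos]
            pos += 1
            if tok == open_tok:
                depth += 1
            elif tok == close_tok:
                depth -= 1
                if depth == 0:
                    return

    def declarator():
        nonlocal pos
        pointers = 0
        while tokens[pos] == "*":
            pointers += 1
            pos += 1
        name, derived = direct()
        # Suffixes bind tighter than '*', so pointers are applied last.
        return name, derived + ["pointer"] * pointers

    def direct():
        nonlocal pos
        if tokens[pos] == "(":
            pos += 1  # grouping '('
            name, derived = declarator()
            pos += 1  # matching ')'
        else:
            name, derived = tokens[pos], []
            pos += 1
        # Any following '(' is a parameter list, any '[' an array bound.
        while pos < len(tokens) and tokens[pos] in ("(", "["):
            if tokens[pos] == "(":
                skip_balanced("(", ")")
                derived.append("function")
            else:
                skip_balanced("[", "]")
                derived.append("array")
        return name, derived

    return declarator()


def classify(text):
    """Label a declarator by the derivation closest to its identifier."""
    name, derived = describe_declarator(text)
    if derived[:1] == ["function"]:
        kind = "function"
    elif derived[:2] == ["pointer", "function"]:
        kind = "function pointer"
    else:
        kind = "other"
    return name, kind
```

Under this reading, describe_declarator("*fp(int,int)") gives ("fp", ["function", "pointer"]), that is, a function returning a pointer, and describe_declarator("*(*function2())(int, int)") gives ("function2", ["function", "pointer", "function", "pointer"]), a function returning a pointer to a function that returns a pointer. A declarator written as (*fp)(int,int) is the form that classify labels "function pointer". Both function4 and function5 come out as "function", since the first derivation applied to the name is a parameter list.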
Is there any way I can store fp in the function_pointer dict and function2() in the function_signatures dictionary?

So far I am treating both of these the same way: I store all function pointers in the function_pointer dictionary and then do a string comparison against other .h files, which is a naive check. To improve this, I was thinking of changing the dictionaries so that I can group the checks for normal functions together with those for functions returning pointers.
