Modifying Python AST while preserving comments

I am currently working with the AST in Python. I read a Python file, generate its AST, modify it, and then convert it back to source code. I'm using a transformer that adds a getter to a class (a visitor pattern built on ast.NodeTransformer). My code works as expected, but it does not preserve comments, which is my issue. Below is my code:

import ast
import meta.asttools  # third-party "meta" package, converts an AST back to source

things = {}  # assumed: module-level dict shared with genTransformer

# Visits nodes and generates getters or setters
def genGet(file, type, func):
    things['func'] = func
    things['type'] = type
    with open(file) as f:
        code = f.read()             # get the code
    tree = ast.parse(code)          # make the AST from the code
    genTransformer().visit(tree)    # generate getters or setters, depending on the type argument
    source = meta.asttools.dump_python_source(tree)  # convert the modified AST back to source code
    newfile = "{}{}".format(file[:-3], "_mod.py")
    print("attempting to write source code to new file: {}".format(newfile))
    with open(newfile, 'w') as outputfile:
        outputfile.write(source)    # write the new source code to a file


class genTransformer(ast.NodeTransformer):
    ...
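To make the problem concrete, here is a minimal sketch of why the comments disappear: the `ast` module never stores them, so no code generator working from the AST can emit them again. This sketch assumes Python 3.9+ and uses the standard library's `ast.unparse` in place of `meta.asttools.dump_python_source`; the sample source string is made up for illustration.

```python
import ast

# Hypothetical input: one standalone comment and one trailing comment.
source = (
    "# module comment\n"
    "x = 1  # trailing comment\n"
)

tree = ast.parse(source)          # comments are discarded at this step
regenerated = ast.unparse(tree)   # assumption: Python 3.9+, stands in for meta.asttools

print(regenerated)                # only the assignment survives, both comments are gone
```

Any tool that round-trips through `ast` has the same limitation, which is why a tree that keeps formatting information is needed.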

I have done some research on lib2to3 which apparently can preserve comments but have not found anything as of yet that helps with my problem. For example, I found the code below but don't really understand it. It appears to preserve comments but not allow my modifications. I get a missing attribute error when it runs.

import ast
from lib2to3.pgen2 import driver
from lib2to3 import pygram, pytree

def main():
    filename = "exfunctions.py"
    with open(filename) as f:
        code = f.read()
    drv = driver.Driver(pygram.python_grammar, pytree.convert)
    tree = drv.parse_string(code, True)
    # the ast transformer breaks if it is placed here
    print(str(tree))
I am having trouble finding a package or strategy to preserve comments whilst transforming an AST. Thus far my research has not helped me. What can I use that will allow me to modify an AST but also preserve the comments?

There is 1 answer

Answered by Lai Jimmy

LibCST is a Python concrete syntax tree parser and toolkit that can solve your problem. It provides a syntax tree that looks like ast but preserves formatting information, including comments and whitespace. It also provides a transformer pattern for modifying the tree.

https://github.com/Instagram/LibCST/

https://libcst.readthedocs.io/en/latest/index.html

import libcst as cst

class NameTransformer(cst.CSTTransformer):
    def leave_Name(self, original_node, updated_node):
        return cst.Name(updated_node.value.upper())

With a NameTransformer like this, we can transform all names in source code to upper case:

>>> m = cst.parse_module("def fn(): return (value)")
>>> m.visit(NameTransformer()).code
'def FN(): return VALUE'