Python NodeTransformer: how to remove nodes?


I'm playing around with AST manipulations. Currently I'm trying to remove certain nodes from an input AST. The NodeTransformer class is an appropriate tool for this purpose, I think. Sadly, it doesn't behave as expected.

The documentation says:

"The NodeTransformer will walk the AST and use the return value of the visitor methods to replace or remove the old node. If the return value of the visitor method is None, the node will be removed from its location, otherwise it is replaced with the return value."

Now look at my program:

import _ast
import ast
import sys

#ast transformer
class MyTransformer(ast.NodeTransformer):

    def iterate_children(self, node):
        """
        helper
        """
        children = ast.iter_child_nodes(node)
        for c in children:
            self.visit(c)

    def generic_visit(self, node):
        """
        default behaviour
        """
        print("visiting: "+node.__class__.__name__)
        self.iterate_children(node)
        return node

    def visit_For(self, node):
        """
        For nodes: replace with nothing
        """
        print("removing a For node")
        return None



#read source program
filename = sys.argv[1]
with open (filename, "r") as myfile:
    source = str(myfile.read())

#compile source to ast
m = compile(source, "<string>", "exec", _ast.PyCF_ONLY_AST)

#do ast manipulation
t = MyTransformer()
t.visit(m)

# fix locations
m = ast.fix_missing_locations(m)

#visualize the resulting ast
#p = AstPrinter()
#p.fromAst(m)

#execute the transformed program
print("computing...")
codeobj = compile(m, '<string>', 'exec')
exec(codeobj)

Here is the input file:

l = [0, 1, 2, 3]

total = 0

for i in l:
    total += i

print(total)

And the outcome:

visiting: Module
visiting: Assign
visiting: Name
visiting: Store
visiting: List
visiting: Num
visiting: Num
visiting: Num
visiting: Num
visiting: Load
visiting: Assign
visiting: Name
visiting: Store
visiting: Num
removing a For node
visiting: Expr
visiting: Call
visiting: Name
visiting: Load
visiting: Name
visiting: Load
computing...
6

I expected a '0', because the loop has been removed. But there is a '6' (=0+1+2+3).

Does anybody know why?

Python version: 3.2.3

[Image: AST illustration]

The numbers in ( ) indicate the line number in the input program. Code for image drawing is not provided here; please ignore the "root" node. As you can see, the For loop is still there.

Thanks for reading!

Update 21.8.:

I posted a link to this question on the Python mailing list ([email protected]). It seems I did too much overriding. Without the children visitor and the custom generic_visit(), it works as expected.

Entire source code of MyTransformer:

class MyTransformer(ast.NodeTransformer):
    def visit_For(self, node):
        """
        For nodes: replace with nothing
        """
        print("removing a For node")
        return None
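
A quick way to confirm this is sketched below, assuming the input program is held in a string instead of being read from a file. The names SOURCE and namespace are only for illustration, and the final print() of the input program is left out so the result can be inspected directly.

```python
import ast


class MyTransformer(ast.NodeTransformer):
    def visit_For(self, node):
        # Returning None tells NodeTransformer to drop this node.
        return None


# Same input program as above, minus the final print().
SOURCE = """
l = [0, 1, 2, 3]
total = 0
for i in l:
    total += i
"""

tree = ast.parse(SOURCE)
tree = MyTransformer().visit(tree)
ast.fix_missing_locations(tree)

# Run the transformed program in its own namespace and inspect the result.
namespace = {}
exec(compile(tree, "<string>", "exec"), namespace)
print(namespace["total"])  # 0, because the loop is gone
```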

There is 1 answer

VeLKerr On

No, it works correctly now because you removed your own generic_visit(). As you can see in the source code of ast.py, NodeTransformer is a subclass of NodeVisitor and provides its own generic_visit() method. That method is what updates the AST: it visits each child and uses the return value to replace or remove that child. If you override it, you need to know exactly what you are doing, because the override replaces the core logic of NodeTransformer.
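
To make the difference visible, here is a minimal sketch that compares the two variants on a shortened input. The class names Broken and Working and the helper count_for_loops() are made up for this illustration; only ast.NodeTransformer, ast.iter_child_nodes(), ast.walk() and ast.parse() come from the standard library.

```python
import ast

SOURCE = "total = 0\nfor i in [0, 1, 2, 3]:\n    total += i\n"


class Broken(ast.NodeTransformer):
    def generic_visit(self, node):
        # The return value of self.visit(child) is thrown away here, so the
        # None returned by visit_For never reaches the parent's body list.
        for child in ast.iter_child_nodes(node):
            self.visit(child)
        return node

    def visit_For(self, node):
        return None


class Working(ast.NodeTransformer):
    # No generic_visit override: the inherited one rewrites the child lists.
    def visit_For(self, node):
        return None


def count_for_loops(tree):
    """Count the For nodes left in a tree (helper for this sketch only)."""
    return sum(isinstance(n, ast.For) for n in ast.walk(tree))


broken_tree = Broken().visit(ast.parse(SOURCE))
working_tree = Working().visit(ast.parse(SOURCE))

print(count_for_loops(broken_tree))   # 1: the loop is still in the tree
print(count_for_loops(working_tree))  # 0: the loop was removed
```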

If you still need to override generic_visit() (for example, to print messages like visiting: <AST object> for each node), you have to call the parent method from your generic_visit(). Your method would then look like this:

def generic_visit(self, node):
    """
    printing visit messages
    """
    print("visiting: " + node.__class__.__name__)
    # Let NodeTransformer do the real work and hand back its result.
    return super().generic_visit(node)

The iterate_children() helper doesn't change the result in this case, but it should be removed as well. It forces the visitor to walk the children of each node, which generic_visit() already does, so with iterate_children() nodes are visited more than once. This wastes computing time and may cause errors in more complex cases.
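
As a rough check of the points above, the following sketch counts visits instead of printing them. The class name LoggingTransformer and its visits counter are assumptions added for illustration and are not part of the question's code. The input program is inlined as a string without its final print().

```python
import ast
from collections import Counter


class LoggingTransformer(ast.NodeTransformer):
    def __init__(self):
        # Hypothetical bookkeeping: how often each node type was visited.
        self.visits = Counter()

    def generic_visit(self, node):
        self.visits[type(node).__name__] += 1
        # Delegate to NodeTransformer so child lists are actually rewritten.
        return super().generic_visit(node)

    def visit_For(self, node):
        # Returning None removes the loop from its parent's body.
        return None


SOURCE = "l = [0, 1, 2, 3]\ntotal = 0\nfor i in l:\n    total += i\n"

transformer = LoggingTransformer()
tree = ast.fix_missing_locations(transformer.visit(ast.parse(SOURCE)))

namespace = {}
exec(compile(tree, "<string>", "exec"), namespace)

print(namespace["total"])            # 0: the loop no longer runs
print(transformer.visits["Module"])  # 1: the module is visited exactly once
print(transformer.visits["Assign"])  # 2: one visit per assignment statement
```

Without the extra iterate_children() pass, each node is handed to generic_visit() once, which is what the counts reflect.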