A ubiquitous phrase in literature describing the Truffle language implementation framework is "Tree Rewriting". It's often said that this allows tree nodes to become more specialized to their operands (like their types).
However, in tutorials such as this one and this one, there is no tree rewriting as far as I can see. Nodes become specialized by mutating a state field in them that tells them what's their current assumption about the type of their operands, and then branching on the value of that field.
So presumably, what handles AST Node rewriting is the Truffle framework itself.
So my question is : What does Truffle literature mean when it mentions "Tree Rewriting" ? When exactly is a Node told to replace itself ? and how does it replace itself ? and what does that gain us ?
I have read lots of papers and seen tons of presentation on Truffle and Graal and it's never clear to me what's meant by this.
Very good questions! Let me try to shed some light.
Tree rewriting (or AST Self specializing), as you pointed out, is a technique in which the AST node can replace itself with a more specialized node when it observes a situation where the more specialized node will perform the job better (e.g. faster execution for a certain input type).
Now, one could implement this literally:
And historically some Truffle Interprets were written in this way and you can still see the API supporting this (e.g. https://www.graalvm.org/truffle/javadoc/com/oracle/truffle/api/nodes/Node.html#replace-T-).
This is all fine and dandy, but if you have a lot of specializations (as one does in say TruffleRuby) then the approach starts to become slow and puts pressure on the garbage collector.
To avoid the allocations the Truffle DSL generates a single node which does not "replace" itself literally when rewriting but rather rewrites itself by changing it's own state (though a literal
state
field) to track which specializations are "active". That way, truffle can allocate just one node in the AST and do the rewrites internally on that node.So, TL;DR; Node Rewriting is implemented as state changes in the node itself for performance reasons.