I want to perform data-flow analysis on some binaries. To simplify the transformations and analysis, and to support multiple architectures in a generalized way, I intend to lift the assembly to a three-address-code IL. Is there a known, general way to achieve this? Right now I'm experimenting with AArch64 assembly, and all I have is a dictionary that maps each instruction mnemonic to a function that builds the three-address-code IL objects with the appropriate expressions, for example:
...
# mneg Xd, Xn, Xm  =>  Xd = -(Xn * Xm)
"mneg": lambda inst: [HLI(op=Op.ASGN,
                          result=Name(value=get_inst_operand_reg_at_index(inst, 0)),
                          arg=UnaryOp(op=UnaryOperator.NEGATIVE,
                                      operand=BinOp(left=get_expr_for_operand(inst, 1),
                                                    op=BinaryOperator.MUL,
                                                    right=get_expr_for_operand(inst, 2))))],
...
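To make the shape of that table concrete, here is a minimal, self-contained sketch of the same idea. It is not my real code: the IL classes are simplified stand-ins, and the `Inst` tuple, the `reg` helper, the `LIFTERS` name, and the `lift` function are hypothetical names introduced only for this illustration. It also assumes that every operand is a plain register.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Callable, Dict, List, Tuple, Union


class Op(Enum):
    ASGN = auto()


class UnaryOperator(Enum):
    NEGATIVE = auto()


class BinaryOperator(Enum):
    MUL = auto()


@dataclass(frozen=True)
class Name:
    value: str


@dataclass(frozen=True)
class BinOp:
    left: "Expr"
    op: BinaryOperator
    right: "Expr"


@dataclass(frozen=True)
class UnaryOp:
    op: UnaryOperator
    operand: "Expr"


Expr = Union[Name, BinOp, UnaryOp]


@dataclass(frozen=True)
class HLI:
    """One three-address-code statement: result = arg."""
    op: Op
    result: Name
    arg: Expr


# Hypothetical instruction form for this sketch: (mnemonic, [register names]).
Inst = Tuple[str, List[str]]


def reg(inst: Inst, index: int) -> Name:
    """Assumption: every operand is a register, so it maps directly to a Name."""
    return Name(inst[1][index])


# One hand-written entry per mnemonic; this is the part that gets tedious.
LIFTERS: Dict[str, Callable[[Inst], List[HLI]]] = {
    # mul Xd, Xn, Xm  =>  Xd = Xn * Xm
    "mul": lambda inst: [HLI(Op.ASGN, reg(inst, 0),
                             BinOp(reg(inst, 1), BinaryOperator.MUL, reg(inst, 2)))],
    # mneg Xd, Xn, Xm  =>  Xd = -(Xn * Xm)
    "mneg": lambda inst: [HLI(Op.ASGN, reg(inst, 0),
                              UnaryOp(UnaryOperator.NEGATIVE,
                                      BinOp(reg(inst, 1), BinaryOperator.MUL,
                                            reg(inst, 2))))],
}


def lift(inst: Inst) -> List[HLI]:
    """Look up the mnemonic and build the IL statements for one instruction."""
    return LIFTERS[inst[0]](inst)
```

With this sketch, `lift(("mneg", ["x0", "x1", "x2"]))` returns a single `HLI` that assigns the negated product of `x1` and `x2` to `x0`. Every further instruction needs another entry of this kind.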
This seems too tedious to write for every instruction on every architecture. Am I missing an already-studied field that offers a better approach? I have seen some papers that use reinforcement learning for this, but I'm looking for something simpler. Or is writing the mappings by hand the only way?