I'm using yecc to parse my tokenized asm-like code. Given input like "MOV [1], [2]\nJMP hello",
the lexer produces these tokens:
[{:opcode, 1, :MOV}, {:register, 1, 1}, {:",", 1}, {:register, 1, 2},
{:opcode, 2, :JMP}, {:identifer, 2, :hello}]
When I parse this, I get:
[%{operation: [:MOV, [:REGISTER, 1], [:REGISTER, 2]]},
%{operation: [:JMP, [:CONST, :hello]]}]
But I want every operation to carry its line number so that later stages can report meaningful errors.
So I changed my parser to this:
Nonterminals
code statement operation value.
Terminals
label identifer integer ',' opcode register address address_in_register line_number.
Rootsymbol code.
code -> line_number statement : [{get_line('$1'), '$2'}].
code -> line_number statement code : [{get_line('$1'), '$2'} | '$3'].
%code -> statement : ['$1'].
%code -> statement code : ['$1' | '$2'].
statement -> label : #{'label' => label('$1')}.
statement -> operation : #{'operation' => '$1'}.
operation -> opcode value ',' value : [operation('$1'), '$2', '$4'].
operation -> opcode value : [operation('$1'), '$2'].
operation -> opcode identifer : [operation('$1'), value('$2')].
operation -> opcode : [operation('$1')].
value -> integer : value('$1').
value -> register : value('$1').
value -> address : value('$1').
value -> address_in_register : value('$1').
Erlang code.
get_line({_, Line, _}) -> Line.
operation({opcode, _, OpcodeName}) -> OpcodeName.
label({label, _, Value}) -> Value.
value({identifer, _, Value}) -> ['CONST', Value];
value({integer, _, Value}) -> ['CONST', Value];
value({register, _, Value}) -> ['REGISTER', Value];
value({address, _, Value}) -> ['ADDRESS', Value];
value({address_in_register, _, Value}) -> ['ADDRESS_IN_REGISTER', Value].
(The commented-out rules are the old, working ones.)
Now, for the same input, I get:
{:error, {1, :assembler_parser, ['syntax error before: ', ['\'MOV\'']]}}
How can I fix this?
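The syntax error occurs because the grammar now expects a line_number token before every statement, but the lexer never emits one, so the parser fails at the very first token, MOV. My suggestion is to keep the line numbers inside the tokens, rather than as separate tokens, and change how you build the operations.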
So I would suggest this:
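A minimal sketch of that idea, assuming the lexer output shown in the question (every opcode token is a `{opcode, Line, Name}` triple and there are no `line_number` tokens). It reuses the question's `get_line/1` helper; putting the line number as the second element of each operation list is an illustrative choice, not something the question prescribes:

```erlang
Nonterminals
code statement operation value.

Terminals
label identifer integer ',' opcode register address address_in_register.

Rootsymbol code.

code -> statement : ['$1'].
code -> statement code : ['$1' | '$2'].

statement -> label : #{'label' => label('$1')}.
statement -> operation : #{'operation' => '$1'}.

%% The line number is taken from the opcode token itself.
operation -> opcode value ',' value : [operation('$1'), get_line('$1'), '$2', '$4'].
operation -> opcode value : [operation('$1'), get_line('$1'), '$2'].
operation -> opcode identifer : [operation('$1'), get_line('$1'), value('$2')].
operation -> opcode : [operation('$1'), get_line('$1')].

value -> integer : value('$1').
value -> register : value('$1').
value -> address : value('$1').
value -> address_in_register : value('$1').

Erlang code.

%% Works for any three-element token: {Category, Line, Value}.
get_line({_, Line, _}) -> Line.

operation({opcode, _, OpcodeName}) -> OpcodeName.
label({label, _, Value}) -> Value.

value({identifer, _, Value}) -> ['CONST', Value];
value({integer, _, Value}) -> ['CONST', Value];
value({register, _, Value}) -> ['REGISTER', Value];
value({address, _, Value}) -> ['ADDRESS', Value];
value({address_in_register, _, Value}) -> ['ADDRESS_IN_REGISTER', Value].
```

With the tokens from the question, the first operation should come out as [:MOV, 1, [:REGISTER, 1], [:REGISTER, 2]] and the second as [:JMP, 2, [:CONST, :hello]]. Labels are left unchanged here; the same get_line('$1') call could be added to the label rule if labels need line numbers too.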
Or even this if you want to mirror Elixir AST:
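Again only a sketch under the same assumptions about the tokens. Each operation becomes a `{Name, Meta, Args}` triple, which is the shape Elixir uses for its quoted expressions, with `Meta` being a keyword list such as `[line: 1]`. The helper name `meta/1` is hypothetical:

```erlang
Nonterminals
code statement operation value.

Terminals
label identifer integer ',' opcode register address address_in_register.

Rootsymbol code.

code -> statement : ['$1'].
code -> statement code : ['$1' | '$2'].

statement -> label : #{'label' => label('$1')}.
statement -> operation : #{'operation' => '$1'}.

%% {Name, Meta, Args}: name, metadata keyword list, argument list.
operation -> opcode value ',' value : {operation('$1'), meta('$1'), ['$2', '$4']}.
operation -> opcode value : {operation('$1'), meta('$1'), ['$2']}.
operation -> opcode identifer : {operation('$1'), meta('$1'), [value('$2')]}.
operation -> opcode : {operation('$1'), meta('$1'), []}.

value -> integer : value('$1').
value -> register : value('$1').
value -> address : value('$1').
value -> address_in_register : value('$1').

Erlang code.

%% Hypothetical helper: an Erlang proplist [{line, Line}] is the
%% keyword list [line: Line] when seen from Elixir.
meta({_, Line, _}) -> [{line, Line}].

operation({opcode, _, OpcodeName}) -> OpcodeName.
label({label, _, Value}) -> Value.

value({identifer, _, Value}) -> ['CONST', Value];
value({integer, _, Value}) -> ['CONST', Value];
value({register, _, Value}) -> ['REGISTER', Value];
value({address, _, Value}) -> ['ADDRESS', Value];
value({address_in_register, _, Value}) -> ['ADDRESS_IN_REGISTER', Value].
```

For the question's input, the first operation should then look like {:MOV, [line: 1], [[:REGISTER, 1], [:REGISTER, 2]]}. Keeping the metadata in its own slot means more fields (a column, a file name) can be added later without changing how the arguments are matched.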