Parsing arithmetic expression with variable using menhir meet priority problem

53 views Asked by At

parser.mly

%{
  let env=Hashtbl.create 10
%}
%token <int> INT
%token SUB
%token EOL
%token EQUAL
%token NAME
%left SUB 
%right EQUAL

%start main             /* the entry point */
%type<int> main
%%
 main:
  expr EOL                { $1 }
;

expr:
   INT                     { $1} 
  | expr SUB expr           { $1 - $3}
  | var EQUAL expr {Hashtbl.add env $1 $3;0}
  | var {Hashtbl.find env $1}
;

var:NAME {$1}

Gives the following result when executed

a=1
0
b=2
0
a-b-5
-6
a=1-3
-3
a
1

which means that a=1, b=2, a-b-5 returns the right answer, but a=1-3 doesn't.

If I change the priority:

%left EQUAL
%right SUB

then I have

a=1-3
0
a
-2
a=1
0
b=2
0
a-b-5
4

so it seems I haven't understood the priority in Menhir. How do I fix it? Thanks!

and the line

| var EQUAL expr {Hashtbl.add env $1 $3;0}

I really need is

Hashtbl.add env $1 $3

but it seems that I have to define the type of main,and I can't define

%type<unit> main

so I have to write the ugly '0',how to fix it?

other files: lexer.mll:

{
open Parser
exception Eof
}
rule token = parse
  [' ' '\t']     { token lexbuf }     (* skip blanks *)
| ['\n' ]        { EOL }
 | ['0'-'9']+ as lxm { INT(int_of_string lxm) } 
| '-'            { SUB }
|['a'-'z']+ {NAME}
|'=' {EQUAL}
| eof            { raise Eof }

calc.ml

open Syntax

let _ =
  try
    let lexbuf = Lexing.from_channel stdin in
    while true do
      let result = Parser.main Lexer.token lexbuf in
      print_int result;
      print_newline ();
      flush stdout
    done
  with Lexer.Eof -> exit 0

makefile:

calc:lexer.cmo parser.cmo syntax.cmo calc.cmo 
    ocamlc -o calc lexer.cmo parser.cmo syntax.cmo calc.cmo

lexer.cmo:lexer.mll
    ocamllex lexer.mll
    
parser.cmo:parser.mly
    menhir parser.mly

syntax.cmo:syntax.ml
    ocamlc -c syntax.ml

calc.cmo:calc.ml
    ocamlc -c lexer.ml
    ocamlc -c parser.mli
    ocamlc -c parser.ml
    ocamlc -c calc.ml
1

There are 1 answers

0
jthulhu On

You are confusing priority and associativity of operators. Menhir declaration of tokens sets both of these: with %left SUB, you are declaring that something of the form a-b-c should be parsed as (a-b)-c rather than a-(b-c). Similarly, with %right EQUAL, you are saying that a=b=c should be parsed as a=(b=c), which is probably right in the sense that this the usual associativity of these operators in popular programming languages (ie. C).

However, the order of these declaration also matters: it indicates whether a=b-c should be parsed as a=(b-c) or (a=b)-c. By declaring SUB before EQUAL, EQUAL will have greater priority, so it's the latter parsing that will happen (which is not what you want). Likely, what you want is therefore the following:

%right EQUAL
%left SUB

Also, note that it doesn't make a lot of sense (from my point of view) to see a=b as an expression if it always returns 0. Maybe you meant to return the value of b instead, so that you can chain assignments?