I have a problem with the parsing of arithmetic expressions using pyparsing. I have the following grammar:
numeric_value = (integer_format | float_format | bool_format)("value*")
identifier = Regex('[a-zA-Z_][a-zA-Z_0-9]*')("identifier*")
operand = numeric_value | identifier
expop = Literal('^')("op")
signop = oneOf('+ -')("op")
multop = oneOf('* /')("op")
plusop = oneOf('+ -')("op")
factop = Literal('!')("op")
arithmetic_expr = infixNotation(operand,
    [("!", 1, opAssoc.LEFT),
     ("^", 2, opAssoc.RIGHT),
     (signop, 1, opAssoc.RIGHT),
     (multop, 2, opAssoc.LEFT),
     (plusop, 2, opAssoc.LEFT),]
)("expr")
I would like to use this to parse arithmetic expressions, e.g.,
expr = "9 + 2 * 3"
parse_result = arithmetic_expr.parseString(expr)
I have two problems here.
First, when I dump the result, I get the following:
[['9', '+', ['2', '*', '3']]]
- expr: ['9', '+', ['2', '*', '3']]
- op: '+'
- value: ['9']
The corresponding XML output is:
<result>
  <expr>
    <value>9</value>
    <op>+</op>
    <value>
      <value>2</value>
      <op>*</op>
      <value>3</value>
    </value>
  </expr>
</result>
What I would like is for ['2', '*', '3'] to show up as expr, i.e.,
<result>
  <expr>
    <value>9</value>
    <op>+</op>
    <expr>
      <value>2</value>
      <op>*</op>
      <value>3</value>
    </expr>
  </expr>
</result>
However, I am not sure how to use setResultName() to achieve this.
Second, unfortunately, when I want to iterate over the results, I obtain strings for the simple parts. Hence, I use the XML "hack" as a workaround (I got the idea from here: `pyparsing`: iterating over `ParsedResults`). Is there a better method now?
Best regards Apo
I have one further little question on how to process the parse results. My first attempt was to use a loop, e.g.:
def recurse_arithmetic_expression(tokens):
    for t in tokens:
        if t.getResultName() == "value":
            pass  # do something...
        elif t.getResultName() == "identifier":
            pass  # do something else...
        elif t.getResultName() == "op":
            pass  # do something completely different...
        elif isinstance(t, ParseResults):
            recurse_arithmetic_expression(t)
However, t can be a string or an int/float, so I get an exception when I try to call getResultName on it.
Unfortunately, when I use asDict, the order of the tokens is lost.
Is it possible to obtain an ordered dict and iterate over its keys with something like
for tag, token in tokens.iteritems():
where tag specifies the type of the token (e.g., op, value, identifier, expr...) and token is the corresponding token?
If you want pyparsing to convert numeric strings to integers, you can add a parse action to have that done at parse time. Or use the predefined integer and float expressions defined in pyparsing_common (a namespace class imported with pyparsing), which already do this conversion.
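As a minimal sketch of the pyparsing_common route, the following replaces the question's integer_format and float_format with ppc.integer and ppc.real. The alias ppc and the decision to leave out bool_format are assumptions made for this illustration only.

```python
from pyparsing import pyparsing_common as ppc

# Try real first so that "3.14" is not consumed as the integer 3.
# (bool_format from the question is left out of this sketch.)
numeric_value = (ppc.real | ppc.integer)("value*")

int_tokens = numeric_value.parseString("42")
float_tokens = numeric_value.parseString("3.14")

# The tokens are already Python numbers, not strings.
print(type(int_tokens[0]).__name__, int_tokens[0])
print(type(float_tokens[0]).__name__, float_tokens[0])
```

Because the conversion happens at parse time, later code that walks the results sees int and float objects directly.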
For your naming issue, you can add parse actions that get run at each level of infixNotation, for example a parse action that just adds the 'expr' name to the current parsed group. You'll also want to add '*' to all of your ops so that repeated operators get the same "keep all, not just the last" behavior for results names.
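One way this could look is sketched below. The helper name add_expr_name is hypothetical, and so is the choice to pass the named operator expressions (rather than the bare "!" and "^" strings) to every level. Treat it as an illustration of the per-level parse action idea, not as a definitive grammar.

```python
from pyparsing import (Literal, Regex, infixNotation, oneOf, opAssoc,
                       pyparsing_common as ppc)

numeric_value = (ppc.real | ppc.integer)("value*")
identifier = Regex(r'[a-zA-Z_][a-zA-Z_0-9]*')("identifier*")
operand = numeric_value | identifier

# The trailing '*' keeps every match under the name, not just the last one.
expop = Literal('^')("op*")
signop = oneOf('+ -')("op*")
multop = oneOf('* /')("op*")
plusop = oneOf('+ -')("op*")
factop = Literal('!')("op*")

def add_expr_name(tokens):
    # Hypothetical helper: tokens[0] is the group built for this
    # precedence level; attach it under the results name 'expr'.
    tokens['expr'] = tokens[0]

arithmetic_expr = infixNotation(operand, [
    (factop, 1, opAssoc.LEFT, add_expr_name),
    (expop, 2, opAssoc.RIGHT, add_expr_name),
    (signop, 1, opAssoc.RIGHT, add_expr_name),
    (multop, 2, opAssoc.LEFT, add_expr_name),
    (plusop, 2, opAssoc.LEFT, add_expr_name),
])

parse_result = arithmetic_expr.parseString("9 + 2 * 3")
print(parse_result.dump())
```

The nesting of the tokens themselves is unchanged by the parse action; only the results names attached to each group differ.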
Dump the parse results again (for example with print(parse_result.dump())) to see how they look now.
Note: I generally discourage people from using asXML, as it has to do a fair bit of guessing to create its output. You are probably better off navigating the parsed results manually. Also, look at some of the examples on the pyparsing wiki Examples page, especially SimpleBool.py, which uses classes for the per-level parse actions used in infixNotation.
EDIT::
At this point, I really want to dissuade you from continuing on this path of using results names to guide evaluation of the parsed results. Please look at these two methods for recursing over the parsed tokens (note that the method you were looking for is getName, not getResultName).
eval_parsed_expr relies on the structure of the parsed tokens, rather than on results names. For this limited case, the tokens are all binary operators, so for each nested structure, the resulting tokens are "value [op value]...", and the values themselves could be ints, floats, or nested ParseResults - but never strs, at least not for the 4 binary operators I've hard-coded in this method. Rather than try to special-case yourself to death to handle unary ops and right-associative ops, please look at how this is done in eval_arith.py (http://pyparsing.wikispaces.com/file/view/eval_arith.py/68273277/eval_arith.py), by associating evaluator classes to each operand type, and each level of the infixNotation.
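To make the structure-based idea concrete, here is a small sketch that evaluates a "value [op value]..." sequence without looking at any results names. The function name eval_nested and the BINARY_OPS table are hypothetical, and the sketch works on plain nested lists (such as the output of asList()) so that it runs without a grammar. It assumes numbers were already converted at parse time and that only the four left-associative binary operators occur.

```python
import operator

# The four binary operators handled by this sketch.
BINARY_OPS = {
    '+': operator.add,
    '-': operator.sub,
    '*': operator.mul,
    '/': operator.truediv,
}

def eval_nested(tokens):
    """Evaluate a 'value [op value]...' sequence from left to right."""
    if isinstance(tokens, (int, float)):
        return tokens
    items = list(tokens)
    result = eval_nested(items[0])
    # Walk the remaining items pairwise: (op, value), (op, value), ...
    for op, operand in zip(items[1::2], items[2::2]):
        result = BINARY_OPS[op](result, eval_nested(operand))
    return result

print(eval_nested([9, '+', [2, '*', 3]]))          # 15
print(eval_nested([[8, '/', 2, '/', 2], '-', 1]))  # 1.0
```

Unary and right-associative operators would need extra cases here, which is the special-casing that the evaluator classes in eval_arith.py are meant to avoid.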