I need to parse a legacy format. I want to write a parser that recognizes the format and transforms it into an object that is easier to work with.
I managed to parse the input; the problem arises when I want to transform it back to a string. To sum up: when I pass the result of my parse() as an argument to my compose() method, it does not return the correct string.
Here are the output and the source code. I'm a beginner when it comes to PEG; is there anything I misunderstood? Notice that my initial string contains (126000-147600,3); while in the composed string the same entry comes with a '-' in front of it.
Output:
********************************************************************************
-t gmt+1 -n GB_EN -p '39600-61200,0; (126000-147600,3); -(212400-234000,5); 298800; (320400); 385200-406800,0; 471600-493200,0; 558000-579600,0'
********************************************************************************
gmt+1 GB_EN
********************************************************************************
[{'end': '61200', 'interval': '0', 'start': '39600'},
{'end': '147600', 'interval': '3', 'start': '126000'},
{'end': '234000', 'interval': '5', 'inverted': True, 'start': '212400'},
{'start': '298800'},
{'start': '320400'},
{'end': '406800', 'interval': '0', 'start': '385200'},
{'end': '493200', 'interval': '0', 'start': '471600'},
{'end': '579600', 'interval': '0', 'start': '558000'}]
-t gmt+1 -n GB_EN -p '39600-61200,0; -(126000-147600,3); -(212400-234000,5); 298800; -(320400); 385200-406800,0; 471600-493200,0; 558000-579600,0'
Python source code:
from pypeg2 import *
from pprint import pprint

Timezone = re.compile(r"(?i)gmt[\+\-]\d")
TimeValue = re.compile(r"[\d]+")


class ObjectSerializerMixin(object):
    def get_as_object(self):
        obj = {}
        for attr in ['start', 'end', 'interval', 'inverted']:
            if getattr(self, attr, None):
                obj[attr] = getattr(self, attr)
        return obj


class TimeFixed(str, ObjectSerializerMixin):
    grammar = attr('start', TimeValue)


class TimePeriod(Namespace, ObjectSerializerMixin):
    grammar = attr('start', TimeValue), '-', attr('end', TimeValue), ',', attr('interval', TimeValue)


class TimePeriodWrapped(Namespace, ObjectSerializerMixin):
    grammar = flag("inverted", '-'), "(", attr('start', TimeValue), '-', attr('end', TimeValue), ',', attr('interval', TimeValue), ")"


class TimeFixedWrapped(Namespace, ObjectSerializerMixin):
    grammar = flag("inverted", '-'), "(", attr('start', TimeValue), ")"


class TimeList(List):
    grammar = csl([TimePeriod, TimeFixed, TimePeriodWrapped, TimeFixedWrapped], separator=";")

    def __str__(self):
        for a in self:
            print(a.get_as_object())
        return ''


class AlertExpression(List):
    grammar = '-t', blank, attr('timezone', Timezone), blank, '-n', blank, attr('locale'), blank, "-p", optional(blank), "'", attr('timelist', TimeList), "'"

    def get_time_objects(self):
        for item in self.timelist:
            yield item.get_as_object()

    def __str__(self):
        return '{} {}'.format(self.timezone, self.locale)


if __name__ == '__main__':
    s = """-t gmt+1 -n GB_EN -p '39600-61200,0; (126000-147600,3); -(212400-234000,5); 298800; (320400); 385200-406800,0; 471600-493200,0; 558000-579600,0'"""
    p = parse(s, AlertExpression)
    print("*"*80)
    print(s)
    print("*"*80)
    print(p)
    print("*"*80)
    pprint(list(p.get_time_objects()))
    print(compose(p))
I'm pretty sure this is a bug in pypeg2. You can verify this with a simplified version of the pypeg2 example given here, but using values similar to the ones you are using:
This demonstrates with a minimal example that the value of the flag variable (inverted) has no effect on the composition. As you have found for yourself, your parse is working as you want it.
I've had a quick look through the code and this is where the compose is. The module is all written within the one __init__.py file, and this function is recursive. As far as I can tell, the problem is that when the flag is False, the '-' object is still passed into compose (at the bottom level of recursion) as a str type and simply added into the composed string here.
Update: I isolated the bug to this line (1406), which unpacks the flag attribute incorrectly. It sends the string '-' back to compose() and appends it whatever the value of the property, which has type bool.
A partial workaround is to replace that line with
text.append(self.compose(thing, g))
similar to the clauses above (so Attribute types are treated the same as they would be ordinarily once they are unpacked from a tuple), but you then hit this bug where optional attributes (flags are just a special case of type Attribute) are not composed properly when they are missing from the object.
As a workaround for that, you could go to line 1350 of the same file and replace
with
I'm not sure this is a totally robust fix, but it's a workaround that will get you going.
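To make the flag problem described in this answer concrete, here is a toy model in plain Python. It is not pypeg2's code and does not import pypeg2; the names Literal, Attr, Flag, WRAPPED_PERIOD, compose_buggy and compose_fixed are all hypothetical and exist only for this sketch. It assumes the failure mode is the one described here: the flag's literal text is emitted without consulting the boolean stored on the object.

```python
from dataclasses import dataclass
from typing import Mapping, Tuple, Union


@dataclass(frozen=True)
class Literal:
    """A fixed piece of text in the grammar, such as "(" or ")"."""
    text: str


@dataclass(frozen=True)
class Attr:
    """A value stored on the parsed object under `name`."""
    name: str


@dataclass(frozen=True)
class Flag:
    """A boolean attribute that stands for an optional literal."""
    name: str
    text: str


Grammar = Tuple[Union[Literal, Attr, Flag], ...]

# Toy counterpart of the TimePeriodWrapped grammar from the question.
WRAPPED_PERIOD: Grammar = (
    Flag("inverted", "-"), Literal("("), Attr("start"), Literal("-"),
    Attr("end"), Literal(","), Attr("interval"), Literal(")"),
)


def compose_buggy(obj: Mapping[str, object], grammar: Grammar) -> str:
    """Emit the flag's literal without looking at the flag's value."""
    parts = []
    for g in grammar:
        if isinstance(g, Literal):
            parts.append(g.text)
        elif isinstance(g, Flag):
            parts.append(g.text)  # bug: obj[g.name] is never consulted
        else:
            parts.append(str(obj[g.name]))
    return "".join(parts)


def compose_fixed(obj: Mapping[str, object], grammar: Grammar) -> str:
    """Emit the flag's literal only when the flag is set on the object."""
    parts = []
    for g in grammar:
        if isinstance(g, Literal):
            parts.append(g.text)
        elif isinstance(g, Flag):
            if obj.get(g.name, False):  # a missing flag counts as False
                parts.append(g.text)
        else:
            parts.append(str(obj[g.name]))
    return "".join(parts)


plain = {"start": "126000", "end": "147600", "interval": "3", "inverted": False}
inverted = {"start": "212400", "end": "234000", "interval": "5", "inverted": True}

print(compose_buggy(plain, WRAPPED_PERIOD))     # -(126000-147600,3)
print(compose_fixed(plain, WRAPPED_PERIOD))     # (126000-147600,3)
print(compose_fixed(inverted, WRAPPED_PERIOD))  # -(212400-234000,5)
```

The only difference between the two functions is the check on the stored boolean, which also covers the case where the flag attribute is absent from the object.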
Output
With those two workarounds / fixes applied to the pypeg2 module file, the output you get from print(compose(p)) is as desired and you can continue to use the pypeg2 module.
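If you would rather not patch the installed module, one alternative is to skip compose() for the time list and build the string by hand from the parsed values. The following is a minimal sketch under some assumptions: compose_time_item and compose_alert are hypothetical helpers, and the 'wrapped' key does not exist in the question's dictionaries. It is added here because those dictionaries do not record whether an entry was written in parentheses; with the question's classes it could probably be derived from whether the item is a TimePeriodWrapped or TimeFixedWrapped instance.

```python
from typing import Iterable, Mapping


def compose_time_item(item: Mapping[str, object]) -> str:
    """Render one time entry in the legacy syntax.

    Uses the keys from the question ('start', 'end', 'interval',
    'inverted') plus the hypothetical 'wrapped' key described above.
    """
    text = str(item["start"])
    if "end" in item:
        text += "-{},{}".format(item["end"], item["interval"])
    inverted = bool(item.get("inverted", False))
    # In the sample input an inverted entry is always parenthesised.
    if inverted or item.get("wrapped", False):
        text = "({})".format(text)
    if inverted:
        text = "-" + text
    return text


def compose_alert(timezone: str, locale: str,
                  items: Iterable[Mapping[str, object]]) -> str:
    """Rebuild the full expression in the layout of the sample input."""
    timelist = "; ".join(compose_time_item(i) for i in items)
    return "-t {} -n {} -p '{}'".format(timezone, locale, timelist)


items = [
    {"start": "39600", "end": "61200", "interval": "0"},
    {"start": "126000", "end": "147600", "interval": "3", "wrapped": True},
    {"start": "212400", "end": "234000", "interval": "5", "inverted": True},
    {"start": "298800"},
    {"start": "320400", "wrapped": True},
    {"start": "385200", "end": "406800", "interval": "0"},
    {"start": "471600", "end": "493200", "interval": "0"},
    {"start": "558000", "end": "579600", "interval": "0"},
]

original = ("-t gmt+1 -n GB_EN -p '39600-61200,0; (126000-147600,3); "
            "-(212400-234000,5); 298800; (320400); 385200-406800,0; "
            "471600-493200,0; 558000-579600,0'")

result = compose_alert("gmt+1", "GB_EN", items)
print(result == original)
```

This keeps the grammar as the single source of truth for parsing and only duplicates the output layout, which may be acceptable for a small legacy format like this one.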