I need to parse Touchstone files (versions 1.1 and 2.0), but these have a strange rule in the syntax (see page 11 of the 1.1 spec, the top paragraph starting with "Note").
So, I need to change the syntax rule from 'data points' to 'noise parameters', depending on the first float of the line. Like in this example:
! NETWORK PARAMETERS
2 .95 -26 3.57 157 .04 76 .66 -14
22 .60 -144 1.30 40 .14 40 .56 -85
! NOISE PARAMETERS (the drop from 22 on the line above to 4 on the line below should trigger the change of syntax)
4 .7 .64 69 .38
18 2.7 .46 -33 .40
(The lines starting with ! are comments and are optional)
There is no other parameter in the data file to help. (This only occurs in 'old' 1.x version of the spec. In the 2.0 version (which still has to be compatible with 1.*), a keyword was introduced).
How can I implement this in a single grammar? (I suspect the only solution is a line-by-line parser?)
This is probably a good case for using a parse action to detect when a new group of lines is found. It is possible to make a parser that dynamically redefines itself, but that is unnecessarily complicated here. For this case, we'll write a parser that reads all the lines of values, and then regroups them based on the rule "the first value starts a new group if it is less than or equal to the previous line's first value".
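The regrouping rule on its own can be sketched in plain Python, independent of pyparsing. This is only an illustration under assumptions: the function name `regroup` is hypothetical, and the rows are assumed to be already converted to lists of numbers. The same loop could serve as the body of a parse action.

```python
def regroup(rows):
    """Split rows into groups. A row starts a new group when its first value
    is less than or equal to the first value of the previous row."""
    groups = []
    previous_first = None
    for row in rows:
        first = row[0]
        if previous_first is None or first <= previous_first:
            groups.append([])  # the first row, or a drop in frequency
        groups[-1].append(row)
        previous_first = first
    return groups


# The rows from the example in the question, already converted to numbers.
rows = [
    [2, .95, -26, 3.57, 157, .04, 76, .66, -14],
    [22, .60, -144, 1.30, 40, .14, 40, .56, -85],
    [4, .7, .64, 69, .38],
    [18, 2.7, .46, -33, .40],
]
network, noise = regroup(rows)
```

With the sample data, the drop from 22 to 4 opens the second group, so `network` holds the two 9-value rows and `noise` holds the two 5-value rows.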
First we need a parser that parses all the lines. Since line endings are going to be significant in this parser, we'll have to redefine the default whitespace characters at the start (and define an NL expression that we can insert in the parser, since we'll have to explicitly parse them now):
I wanted to just use pyparsing's numeric string matcher/converter defined in `pp.common.fnumber`, but it does not accept floats that start with ".". So we define a `Regex` that suits your numeric values, and add a converter to convert to ints or floats at parse time:
With these pieces in place, we can define the parser for these lines, using `Group` to keep each line's values separate, and ignore your comments (I'm also using the relatively new `[...]` and `[1, ...]` notation in place of `ZeroOrMore` and `OneOrMore`):
At this point, if we use `parameters` to parse your input, we get a single list of the lines of values, each line in a sub-group:
Note that they are not parsed strings now, but have been converted to ints or floats.
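The number handling described above (accept values such as `.95`, convert to int or float, keep one sub-list per line, skip `!` comments) can be sketched with only the standard `re` module. This is not the pyparsing code from the answer: the names `NUMBER`, `to_number` and `parse_rows` are hypothetical, and the input is assumed to contain only comments, blank lines and data lines. A pattern like this one could also be handed to pyparsing's `Regex`.

```python
import re

# Optional sign, then digits with an optional fraction or a bare ".digits",
# then an optional exponent. Accepts "2", "-26", "3.57" and ".95".
NUMBER = re.compile(r"[+-]?(?:\d+\.?\d*|\.\d+)(?:[eE][+-]?\d+)?")


def to_number(text):
    """Convert a numeric string to int if possible, otherwise to float."""
    try:
        return int(text)
    except ValueError:
        return float(text)


def parse_rows(source):
    """Return one list of numbers per data line, ignoring '!' comments."""
    rows = []
    for line in source.splitlines():
        data = line.split("!", 1)[0]  # drop a comment, if any
        tokens = data.split()
        if not tokens:
            continue  # blank or comment-only line
        for token in tokens:
            if NUMBER.fullmatch(token) is None:
                raise ValueError(f"not a number: {token!r}")
        rows.append([to_number(token) for token in tokens])
    return rows


sample = """! NETWORK PARAMETERS
2 .95 -26 3.57 157 .04 76 .66 -14
! NOISE PARAMETERS
4 .7 .64 69 .38
"""
parsed = parse_rows(sample)
```

Here `parsed` contains two rows of numbers rather than strings; whole values such as `2` and `-26` become ints and the rest become floats.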
Here is a railroad diagram for that parser, created using the following added lines:
To perform the regrouping, we'll add another parse action, this time on the `parameters` expression:
Now if we parse using this parser:
we get:
And you can access the fields directly: