Using Python's textX library, I am having trouble capturing exactly what I want from a network configuration file. I can capture an interface and its attributes, but I want textX to do most of the parsing and filtering before it passes the model back to my code. The issue is that textX does not capture everything I thought I was targeting: some of the content is left unparsed or ignored.
I apologize for the convoluted language. Here is the text I am trying to parse. Assume this block is surrounded by other data, which the parser ignores based on the grammar shown afterward:
Network configuration file
interface TenGigabitEthernet1/1/4
 description Some interface
 switchport trunk native vlan 232
 switchport trunk allowed vlan 395,398
 switchport mode trunk
 switchport nonegotiate
 no cdp enable
 flowcontrol receive off
 channel-group 1 mode active
 lacp rate fast
!
interface Vlan1
 no ip addr
grammar.tx
Config:
(
Junk
| interface=INTERFACE
)*
;
Junk[noskipws]:
!(INTERFACE)/[^\n]*\n/
;
INTERFACE[noskipws]:
'interface' intf=/.+[0-9]+[^\n]*//\n/-
(mode+=MODE
attributes+=ATTRIBUTES)#
;
ATTRIBUTES[noskipws]:
!MODE/\s+[^\n]*//\n/-
;
MODE[noskipws]:
/\s+/'switchport mode trunk'/[^\n]*//\n/-
;
The goal is to capture the interface that is in trunk mode, along with all of its definitions. In other words, Config.interface[...] should contain only the following, and the other interface (interface Vlan1) should be ignored:
interface TenGigabitEthernet1/1/4
 description Some interface
 switchport trunk native vlan 232
 switchport trunk allowed vlan 395,398
 switchport mode trunk
 switchport nonegotiate
 no cdp enable
 flowcontrol receive off
 channel-group 1 mode active
 lacp rate fast
Instead, I find that the grammar captures only:
interface TenGigabitEthernet1/1/4
 description Some interface
 switchport trunk native vlan 232
 switchport trunk allowed vlan 395,398
 switchport mode trunk
I deduce that this is because of the following part of the grammar:
ATTRIBUTES[noskipws]:
!MODE/\s+[^\n]*//\n/-
where it instructs the parser to capture only text that does not have switchport mode trunk before it. So I tried to cover both cases with the following expression:
ATTRIBUTES[noskipws]:
(!MODE/\s+[^\n]*//\n/-
|/\s+[^\n]*//\n/-!MODE)
However, this resulted in len(Config.interface) == 0.
I'll continue reading the docs, but I am posting here in case someone can spot my mistake.
One possible solution... but not ideal
Config:
(
Junk
| interface=INTERFACE
)*
;
Junk[noskipws]:
!(INTERFACE)/[^\n]*\n/
;
INTERFACE[noskipws]:
'interface' intf=/.+[0-9]+[^\n]*//\n/-
(mode+=MODE
attributes+=PREATTRIBUTES
attributes+=POSTATTRIBUTES
)#
;
PREATTRIBUTES:
!MODE
/\s+[^\n]*//\n/-
;
POSTATTRIBUTES:
/\s+[^\n]*//\n/-
!MODE
;
MODE:
/\s+/'switchport mode trunk'/[^\n]*//\n/-
;
With this configuration, only the interfaces that have 'mode trunk' are placed in the interface list, and all the attributes belonging to each interface are placed in that interface's attributes list.
Still, I think it must be possible to do this with a single ATTRIBUTES definition.
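To check what any of these grammars return, the expected selection can also be computed without textX. The following is a plain-Python sketch, not a textX grammar and not a substitute for one. It assumes that sub-commands are indented under their interface line and that any non-indented line ends the block. The names trunk_interfaces and SAMPLE are made up for this illustration.

```python
def trunk_interfaces(config_text):
    """Return {interface name: [attribute lines]} for trunk-mode interfaces.

    Assumption: attribute lines are indented under their 'interface' line,
    and any non-indented line (such as '!') closes the current block.
    """
    blocks = {}
    current = None
    for line in config_text.splitlines():
        if line.startswith("interface "):
            # Start a new interface block keyed by its name.
            current = line[len("interface "):].strip()
            blocks[current] = []
        elif line[:1].isspace() and current is not None:
            # Indented line: an attribute of the current interface.
            blocks[current].append(line.strip())
        else:
            # Anything else is unrelated content and ends the block.
            current = None
    # Keep only the interfaces that carry the trunk-mode entry.
    return {
        name: attrs
        for name, attrs in blocks.items()
        if "switchport mode trunk" in attrs
    }


SAMPLE = """\
some unrelated line
interface TenGigabitEthernet1/1/4
 description Some interface
 switchport trunk native vlan 232
 switchport trunk allowed vlan 395,398
 switchport mode trunk
 switchport nonegotiate
 no cdp enable
 flowcontrol receive off
 channel-group 1 mode active
 lacp rate fast
!
interface Vlan1
 no ip addr
"""

result = trunk_interfaces(SAMPLE)
for name, attrs in result.items():
    print(name, len(attrs))
```

Comparing this dictionary against the interface names and attributes lists in the textX model is one way to confirm that the attributes after the mode line are no longer being dropped.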
It took me a bit of time, but the following grammar works to parse Cisco configuration files. These files depend on indentation, so the parser needs to take spaces into account. I ran into a few surprises while parsing (mainly the '\n' characters), but with this approach I was able to traverse the whole configuration and select only the entries that carried the switchport mode trunk configuration entry.