Python regex doesn't match dig output


I'm trying to parse some dig output (yes, I know about dnspython, but it doesn't satisfy my requirements) and I'm having trouble finding a regex that fits my use case. I want to find all lines of the dig output that contain both IN and NS. Example output looks like this:

stackexchange.com.  300 IN  NS  ns1.serverfault.com.
stackexchange.com.  300 IN  NS  ns2.serverfault.com.

I tried:

if 'NS' in line:

This finds the relevant lines, but it also produces false positives for NSEC DNS entries, e.g.:

CK0POJMG874LJREF7EFN8430QVIT8BSM.com. 86400 IN NSEC3 1 1 0 - CK0Q5NFFJS5FUB0F2DNA098SBN0O663V NS SOA RRSIG DNSKEY NSEC3PARAM

also shows up in my output. I know about the \s escape, which should match any kind of whitespace, including tabs; however, my regex is failing. I currently have:

for line in output:
    regex = re.compile(r'IN\sNS\s')
    if regex.match(line):
        print(line)

But it isn't working. Can you help me come up with a regex that doesn't produce false positives? Any kind of help is appreciated. Thanks in advance.


There are 2 answers

1
Stefan Seemayer (best answer)

You want search, not match.

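re.match only matches at the beginning of the string, and your lines begin with the domain name rather than with IN. re.search scans the whole string for the first position where the pattern fits.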
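
To make the difference concrete, here is a minimal sketch that runs the same compiled pattern through both methods on one of the sample lines from the question. The variable names are illustrative only.

```python
import re

line = "stackexchange.com.  300 IN  NS  ns1.serverfault.com."
pattern = re.compile(r"IN\s+NS\s")

# match() only tries position 0, where the line starts with the domain name
anchored = pattern.match(line)

# search() walks through the line until the pattern fits somewhere
scanned = pattern.search(line)

print(anchored)              # None
print(scanned is not None)   # True
```

If you do want anchoring at the start of the line, match is the right tool; for "contains this somewhere", search is.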

Additionally, if the amount of whitespace between IN and NS can vary, you can match one or more whitespace characters by adding the + quantifier: \s+.

Your code will be faster if you move the compilation of the regex out of the loop and only compile once:

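import re

# Compile once, outside the loop
regex = re.compile(r'IN\s+NS\s')
for line in output:
    # search() scans the whole line; match() would only test its start
    if regex.search(line):
        print(line)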
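
As a quick check against the false positive from the question, the sketch below runs the pattern over the three sample lines quoted there. The leading \b and the helper name ns_lines are my own additions, not part of the answer above: \b is an optional hardening so that IN is not matched inside a longer token.

```python
import re

# Assumption: \b is added here so "IN" cannot match as the tail of a longer word
NS_RECORD = re.compile(r"\bIN\s+NS\s")

def ns_lines(lines):
    """Return only the lines whose class/type fields are IN followed by NS."""
    return [line for line in lines if NS_RECORD.search(line)]

sample = [
    "stackexchange.com.  300 IN  NS  ns1.serverfault.com.",
    "stackexchange.com.  300 IN  NS  ns2.serverfault.com.",
    "CK0POJMG874LJREF7EFN8430QVIT8BSM.com. 86400 IN NSEC3 1 1 0 - "
    "CK0Q5NFFJS5FUB0F2DNA098SBN0O663V NS SOA RRSIG DNSKEY NSEC3PARAM",
]

for line in ns_lines(sample):
    print(line)
```

The NSEC3 line is rejected for two reasons: "IN NSEC3" has no whitespace right after NS, and the later standalone NS is not preceded by IN.
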
0
Mark

You need to write \s*.

\s matches only a single whitespace character, and you have multiple spaces, or perhaps a tab, that need matching.
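
For comparison, here is a small sketch of how the three spellings behave, using search throughout. The test strings are made up for illustration; the second one, with IN and NS glued together, would not appear in real dig output and is only there to show where \s* and \s+ differ.

```python
import re

single = re.compile(r"IN\sNS\s")   # exactly one whitespace character
plus = re.compile(r"IN\s+NS\s")    # one or more
star = re.compile(r"IN\s*NS\s")    # zero or more

two_spaces = "300 IN  NS  ns1"     # two spaces between IN and NS
glued = "300 INNS ns1"             # hypothetical input with no separator

results = {
    "single/two_spaces": bool(single.search(two_spaces)),  # False
    "plus/two_spaces": bool(plus.search(two_spaces)),      # True
    "star/two_spaces": bool(star.search(two_spaces)),      # True
    "plus/glued": bool(plus.search(glued)),                # False
    "star/glued": bool(star.search(glued)),                # True
}
print(results)
```

Both quantifiers handle the variable spacing; \s+ is slightly stricter because it insists on at least one separator between the two fields.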