Regex to interpret smtlib2 format

90 views Asked by At

I am trying to figure out a regex that could match results of a program that outputs in the smtlib format. Basically, my data is in the form:

 (define-fun X_1 () Int
    281)
 (define-fun X_71 () Int
    104)
 (define-fun X_90 () Int
    21)
 (define-fun X_54 () Int
    250)
etc.

Is it possible to write an expression that matches:

X_1 (...) 281
X_71 (...) 104
X_90 (...) 21
etc.

My current approach is to find individual occurences using \(define-fun[\w\s]+\), then for each occurence, remove (define-fun, Int, () and ), and then read the data as all that's left is something like 1 281, 71 104

I'm just wondering if there is a more elegant way of doing this?

1

There are 1 answers

0
Ryszard Czech On BEST ANSWER

Capture these substrings:

\(define-fun\s+(\S+).*\n\s*(\d+)\)

See regex proof.

EXPLANATION

--------------------------------------------------------------------------------
  \(                       '('
--------------------------------------------------------------------------------
  define-fun               'define-fun'
--------------------------------------------------------------------------------
  \s+                      whitespace (\n, \r, \t, \f, and " ") (1 or
                           more times (matching the most amount
                           possible))
--------------------------------------------------------------------------------
  (                        group and capture to \1:
--------------------------------------------------------------------------------
    \S+                      non-whitespace (all but \n, \r, \t, \f,
                             and " ") (1 or more times (matching the
                             most amount possible))
--------------------------------------------------------------------------------
  )                        end of \1
--------------------------------------------------------------------------------
  .*                       any character except \n (0 or more times
                           (matching the most amount possible))
--------------------------------------------------------------------------------
  \n                       '\n' (newline)
--------------------------------------------------------------------------------
  \s*                      whitespace (\n, \r, \t, \f, and " ") (0 or
                           more times (matching the most amount
                           possible))
--------------------------------------------------------------------------------
  (                        group and capture to \2:
--------------------------------------------------------------------------------
    \d+                      digits (0-9) (1 or more times (matching
                             the most amount possible))
--------------------------------------------------------------------------------
  )                        end of \2
--------------------------------------------------------------------------------
  \)                       ')'