How to convert a grammar into a parsing algorithm using a table?

Question

How to convert a grammar into a parsing algorithm using a table?

194 views Asked by Lance At 27 June 2021 at 18:48

Say I have a grammar like this:

match start
  match first
    match rule a
    match rule b

match a
  match string, "a"

match b
  match string, "b("
  match optional
    match rule start
  match optional
    match many
      match string, ","
      match rule start
  match string, ")"

It compiles into JSON like this:

{
  "start": {
    "type": "rule",
    "name": "start",
    "children": [
      {
        "type": "match-first",
        "children": [
          {
            "type": "match-rule",
            "rule": "a"
          },
          {
            "type": "match-rule",
            "rule": "b"
          }
        ]
      }
    ]
  },
  "a": {
    "type": "rule",
    "name": "a",
    "children": [
      {
        "type": "match-string",
        "value": {
          "type": "string",
          "value": "a"
        }
      }
    ]
  },
  "b": {
    "type": "rule",
    "name": "b",
    "children": [
      {
        "type": "match-string",
        "value": {
          "type": "string",
          "value": "b("
        }
      },
      {
        "type": "match-optional",
        "children": [
          {
            "type": "match-rule",
            "rule": "start"
          }
        ]
      },
      {
        "type": "match-optional",
        "children": [
          {
            "type": "match-many",
            "children": [
              {
                "type": "match-string",
                "value": {
                  "type": "string",
                  "value": ","
                }
              },
              {
                "type": "match-rule",
                "rule": "start"
              }
            ]
          }
        ]
      },
      {
        "type": "match-string",
        "value": {
          "type": "string",
          "value": ")"
        }
      }
    ]
  }
}

It can match any of these strings theoretically:

a
b()
b(a)
b(a,a)
b(b(b()))
b(a,b(a,a,b(),a,a))

How do I generate an optimized parser (that simply returns true/false on whether or not the input string matches) out of this grammar JSON? My initial inkling is to try and do this like my recursive descent parser that just takes a grammar and a string and parses using the "uncompiled" grammar. But it was pointed out that this would be inefficient and is why "parser-generators" generate parsers specific to each grammar. How can that be done using this grammar provided in JSON form? Could it involve "parse tables" somehow? I am not really sure how this could be done. Ideally it would be done without recursion, somehow just looping and shifting the parser state appropriately.

I am not really sure how to begin because I don't see what the desired structure is supposed to be from the output of a generateParser(grammar) function.

function generateParser(grammar) {
  const table = []
  let x = 0
  for (let key in grammar) {
    let rule = grammar[key]
    for (let i = 0, n = rule.children.length; i < n; i++) {
      let pattern = rule.children[i]
      switch (pattern.type) {
        case 'match-first':
          addMatchFirst(pattern)
          break
      }
    }
  }

  return function parse(string) {
    // iterate through table and transition somehow?
  }

  function addMatchFirst(p) {
    table.push((state) => {
      state.stack.push(/* new state? */)
    })

    for (let i = 0, n = p.children.length; i < n; i++) {
      let pattern = p.children[i]
      switch (pattern.type) {
        case 'match-rule':
          addMatchRule(pattern)
          break
      }
    }
  }

  function addMatchRule(p) {
    table.push((state) => {
      state.stack.push(/* new state? */)
    })
  }
}

Looking at a parse table implementation in JavaScript seems to primitive and I don't see how to apply it.

If it's not possible to make this using a table, how else would you generate the parsing algorithm? All I am looking for is a true/false answer from the generated parser on whether or not the input string matches the grammar.

Original Q&A

There are 1 answers

**Michael Dyck** · Answer 1 · 2021-06-27T22:41:14+00:00

How you can parse depends on the semantics of your grammar notation. It has various constructs where there are alternatives:

For a "match first" line, the alternatives are its children.
For a "match optional" line, the alternatives are its child and the empty string.
For a "match many" line, the alternatives are the number of repetitions of its child.

If, at every point where there are alternatives, the semantics are that each alternative is "valid", then your notation basically defines a Context Free Grammar, so a large set of parsing options are available to you.

However, the wording of the "match first" line suggests that it means "try the 'children' in the order given and use the first that succeeds". If that's the case, then it looks like your notation is defining a Parsing Expression Grammar, so the parsing options are more limited. Packrat parsers seem popular.

It's unclear if you could use a table-based parsing algorithm such as you link to -- the Wikipedia article says "It is also possible to build LL parsers and LR parsers from parsing expression grammars", which implies that you could use an LL or LR parsing table. However, there's no citation for that statement, and I'm doubtful that it's true.

Did someone give you the generateParser code as a start at a parser implementation? It has a variable named table, but note that it isn't a table in the LL or LR sense, though there might be some similarities. At any rate, it appears to have some useful chunks of code.

TechQA.

How to convert a grammar into a parsing algorithm using a table?

There are 1 answers

Related Questions in JAVASCRIPT

Related Questions in ALGORITHM

Related Questions in PARSING

Related Questions in GRAMMAR

Related Questions in PARSER-GENERATOR

Popular Questions

Trending Questions