Parse string with unescaped characters


I have a library that supports a custom language. Its parser is written with Scala's RegexParsers, and I'm now trying to rewrite it with the fastparse library to speed up our engine. The question is: is it possible to correctly parse the params of a function in our pseudo-language?

Here is an example:

$out <= doSomething('/mypath[text() != '']', 'def f(a) {a * 2}', ',') <= $in

Here doSomething is a function with 3 params:

  1. /mypath[text() != '']
  2. def f(a) {a * 2}
  3. ,

I'm expecting to get a tree for the function with params:

Function(
    name = doSomething
    params = List[String](
        "/mypath[text() != '']",
        "def f(a) {a * 2}",
        ","
    )
)

What I tried:

val ws = P(CharsWhileIn(" \r\n"))
def wsSep(sep: String) = P(ws.? ~ sep ~ ws.?)
val name = P(CharIn('a' to 'z', 'A' to 'Z').rep)
// The string body stops at the first single quote, which is the problem described below
val param = P(ws.? ~ "'" ~ CharPred(_ != '\'').rep ~ "'" ~ ws.?)
val params = P("(" ~ param.!.rep(sep = wsSep(",")) ~ ")")
val function = P(name.! ~ params.?).map { case (name, params) => Function(name, params.getOrElse(List())) }

The problem is that single quotes delimit a String in my language, but the string itself sometimes contains additional single quotes, as here:

/mypath[text() != '']

So I can't use CharPred(_ != '\'') in my case.
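To make the failure concrete, here is a minimal sketch that uses only the Scala standard library rather than fastparse. The regex `'([^']*)'` is a stand-in I chose for `"'" ~ CharPred(_ != '\'').rep ~ "'"` (both end the body at the first single quote), and the names `NaiveQuote`, `firstParam` and `NaiveQuoteDemo` are made up for illustration.

```scala
object NaiveQuote {
  // Stand-in for "'" ~ CharPred(_ != '\'').rep ~ "'": the body ends at the first quote.
  private val naive = "'([^']*)'".r

  // Returns the body of the quoted string at the start of `input`, if there is one.
  def firstParam(input: String): Option[String] =
    naive.findPrefixMatchOf(input).map(_.group(1))
}

object NaiveQuoteDemo extends App {
  // The first param is cut at the inner quote:
  // the body is "/mypath[text() != " rather than "/mypath[text() != '']".
  println(NaiveQuote.firstParam("'/mypath[text() != '']', 'def f(a) {a * 2}', ','"))
}
```

The truncated body is followed by a second quote instead of a comma, so a parser built this way cannot go on to match the separator.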

We also have commas inside Strings, as in the 3rd param.

This somehow works with the Scala parser, but I can't parse the same input with fastparse.

Does anyone have ideas how to make the parser work properly?

Update

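Got it! The main trick is in val param: a single quote only ends the parameter when it is followed by optional whitespace and then a comma or a closing parenthesis.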

object X {

  import fastparse.all._

  case class Fn(name: String, params: Seq[String])

  val ws = P(CharsWhileIn(" \r\n"))
  def wsSep(sep: String) = P(ws.? ~ sep ~ ws.?)
  val name = P(CharIn('a' to 'z', 'A' to 'Z').rep)
  val param = P(ws.? ~ "'" ~ (!("'" ~ ws.? ~ ("," | ")")) ~ AnyChar).rep  ~ "'" ~ ws.?)
  val params = P("(" ~ param.!.rep(sep = wsSep(",")) ~ ")")
  val function = P(name.! ~ params.?).map{case (name, params) => Fn(name, params.getOrElse(Seq()))}
}


object Test extends App {
  // Parsed.Success and Parsed.Failure come from fastparse.all
  import fastparse.all._

  val res = X.function.parse("myFunction('/hello[name != '']' , 'def f(a) {mytest}', ',')")
  res match {
    case Parsed.Success(r, z) =>
      println(s"fn name: ${r.name}")
      println(s"params:\n {${r.params.mkString("\n")}\n}")
    case Parsed.Failure(e, z, m) => println(m)
  }
}

out:

name: myFunction
params:
'/hello[name != '']' 
'def f(a) {mytest}'
','
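Note that the params above still carry their surrounding quotes, because `.!` captures everything `param` consumed. Below is a sketch of one possible variant that captures only the inner text, so the result matches the tree I was expecting. It assumes the same fastparse 0.4.x API (`fastparse.all._`) as the code above; the names `InnerCapture`, `closing`, `body` and `parse` are mine and only illustrative.

```scala
import fastparse.all._

object InnerCapture {
  case class Fn(name: String, params: Seq[String])

  val ws = P(CharsWhileIn(" \r\n"))
  val name = P(CharIn('a' to 'z', 'A' to 'Z').rep(1))

  // A quote closes the parameter only when optional whitespace
  // and then ',' or ')' follow it.
  val closing = P("'" ~ ws.? ~ ("," | ")"))

  // Consume any character that does not start a closing sequence.
  val body = P((!closing ~ AnyChar).rep)

  // .! is applied to the body alone, so quotes and padding are not captured.
  val param = P(ws.? ~ "'" ~ body.! ~ "'" ~ ws.?)

  // param already consumes the whitespace around it, so a bare "," separator is enough.
  val params = P("(" ~ param.rep(sep = ",") ~ ")")

  val function = P(name.! ~ params.?).map {
    case (n, ps) => Fn(n, ps.getOrElse(Seq()))
  }

  // Hypothetical helper: Some(Fn) on success, None on any parse failure.
  def parse(input: String): Option[Fn] = function.parse(input) match {
    case Parsed.Success(fn, _)     => Some(fn)
    case Parsed.Failure(_, _, _)   => None
  }
}
```

This is still a heuristic: a parameter whose text itself contains a quote followed by a comma or a closing parenthesis would be cut short at that point, so it only works as long as such sequences do not occur inside the strings.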