Scala rep separator for specific area of text

Imagine I've got the following:

--open
Client: enter
Nick
Age 28
Rosewell, USA

Client: enter
Maria
Age 19
Cleveland, USA
--open--

I need a result close to the following: List(List(Nick, Age 28, Rosewell), List(Maria, Age 19, Cleveland))

There can be any number of clients inside the open body, so the size of the list is not fixed.

I was trying to do this with the following:

repsep(".*".r , "Client: enter" + lineSeparator)

In this case all I can parse is this line: List((Client: enter)). How do I make sure the parser keeps working within the same piece of the parsed text?

There is 1 answer

Alexis C. (best answer)

I guess you are using RegexParsers (note that it skips whitespace by default). I'm assuming that the input ends with "\n\n--open--" instead, i.e. with a blank line before the closing marker (if you can't change that, I'll show you how to modify the repsep parser). With this change, the text has the following structure:

  • each client is introduced by the text "Client: enter"
  • then you need to parse each non-empty line after that, with the lines separated by a newline ("\n")
  • if you hit an empty line, parse the line separators and start over with the next client if there is one; otherwise the end of the input has been reached


Then the implementation of the parser is straightforward:

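import scala.util.parsing.combinator.RegexParsers

object ClientParser extends RegexParsers {

  // Newlines are significant here, so turn off the default whitespace skipping.
  override def skipWhitespace = false

  def lineSeparator = "\n"
  // The whole input: opening marker, any number of clients, closing marker.
  def root = "--open" ~> lineSeparator ~> rep(client) <~ "--open--"
  // One client: the header line, its non-empty lines, then any trailing blank lines.
  def client = ("Client: enter" ~ lineSeparator) ~> repsep(".+".r, lineSeparator) <~ rep(lineSeparator)
}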
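
A side note on why skipWhitespace is overridden: with the default setting, RegexParsers silently drops whitespace (newlines included) before every token, so the grammar can no longer see line boundaries or blank lines. The following is only a small sketch to illustrate the difference; SkippingWords, StrictWords and WhitespaceDemo are made-up names that have nothing to do with the client format.

```scala
import scala.util.parsing.combinator.RegexParsers

// Hypothetical demo grammar: keeps the default whitespace skipping.
object SkippingWords extends RegexParsers {
  def words: Parser[List[String]] = rep(regex("[a-z]+".r))
}

// Hypothetical demo grammar: same rule, but whitespace is left in the input.
object StrictWords extends RegexParsers {
  override def skipWhitespace = false
  def words: Parser[List[String]] = rep(regex("[a-z]+".r))
}

object WhitespaceDemo {
  def main(args: Array[String]): Unit = {
    val text = "nick\n\nmaria"
    // Newlines are skipped, so both words are found.
    println(SkippingWords.parseAll(SkippingWords.words, text))
    // The newline is not skipped and no rule consumes it, so parseAll fails.
    println(StrictWords.parseAll(StrictWords.words, text))
  }
}
```

ClientParser above relies on the same override, which is what lets it treat "\n" as an explicit separator.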

Running it with:

--open
Client: enter
Nick
Age 28
Rosewell; USA

Client: enter
Maria
Age 19
Cleveland; USA

--open--

You get:

[12.9] parsed: List(List(Nick, Age 28, Rosewell; USA), List(Maria, Age 19, Cleveland; USA))
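
If the input cannot be changed and "--open--" follows the last line directly (as in the question), the grammar above would most likely swallow "--open--" as one more line of the last client, because ".+" matches it too. One possible way around that, offered as a sketch rather than the definitive fix, is to guard each data line with a negative lookahead via not(...). The names ClientParserNoBlankLine, dataLine and parseClients are assumptions introduced for this example.

```scala
import scala.util.parsing.combinator.RegexParsers

// Hypothetical variant of ClientParser for input without a blank line before "--open--".
object ClientParserNoBlankLine extends RegexParsers {
  override def skipWhitespace = false

  def lineSeparator: Parser[String] = "\n"

  // A data line is any non-empty line that does not start with the closing marker.
  def dataLine: Parser[String] = not(literal("--open--")) ~> ".+".r

  def client: Parser[List[String]] =
    ("Client: enter" ~ lineSeparator) ~> repsep(dataLine, lineSeparator) <~ rep(lineSeparator)

  def root: Parser[List[List[String]]] =
    "--open" ~> lineSeparator ~> rep(client) <~ "--open--"

  // Convenience wrapper: Right with the clients on success, Left with the error message otherwise.
  def parseClients(input: String): Either[String, List[List[String]]] =
    parseAll(root, input) match {
      case Success(clients, _) => Right(clients)
      case failure: NoSuccess  => Left(failure.msg)
    }
}
```

This variant should accept the input with or without the extra blank line. Note that parseAll expects the whole input to be consumed, so with whitespace skipping turned off a trailing newline after "--open--" would probably be reported as a failure.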