Append item to map with Yecc parser in Elixir/Erlang


I am trying to parse a specific log file format with Leex/Yecc in Elixir. After many hours I got the simplest scenario to work. Now I want to take the next step, but I cannot figure out how.

First, here is an example of the log format:

[!] plugin error detected
 |  check the version of the plugin

My first attempt handled only the first kind of line, with multiple entries, like this:

[!] plugin error detected
[!] plugin error 2 detected
[!] plugin error 3 detected

That worked and gave me a nice map containing the text and the log line type (warning):

iex(20)> LogParser.parse("[!] a big warning\n[!] another warning")
[%{text: "a big warning", type: :warning},
 %{text: "another warning", type: :warning}]

That is perfect. But as seen above, a log line can continue on the next line, indicated with a pipe character |. My lexer has the pipe character and the parser understands it, but what I want is for the next line to be appended to the text value of my map. For now it ends up as a separate string next to the map. So instead of:

[%{text: "a big warning ", type: :warning}, " continues on next line"]

I need:

[%{text: "a big warning continues on next line", type: :warning}]

I looked at examples on the net, but most of them have really clear 'end' tokens, such as a closing tag or a closing bracket, and even then it is not clear to me how to add properties so that the eventual AST is correct.

For completeness, here is my lexer:

Definitions.

Char          = [a-zA-Z0-9\.\s\,\[\]]
Word          = [^\t\s\.#"=]+
Space         = [\s\t]
New_Line      = [\n]
%New_Line      = \n|\r\n|\r
Type_Regular  = \[\s\]\s
Type_Warning  = \[!\]\s
Pipe          = \|

Rules.

{Type_Regular}  : {token, {type_regular,  TokenLine}}.
{Type_Warning}  : {token, {type_warning,  TokenLine}}.
{Char}          : {token, {char, TokenLine, TokenChars}}.
{Space}         : skip_token.
{Pipe}          : {token, {pipe, TokenLine}}.
{New_Line}      : skip_token.

Erlang code.

And my parser:

Nonterminals lines line line_content chars.
Terminals type_regular type_warning char pipe.
Rootsymbol lines.

lines -> line lines : ['$1'|['$2']].
lines -> line : '$1'.

line -> pipe line_content : '$2'.
line -> type_regular line_content : #{type => regular, text => '$2'}.
line -> type_warning line_content : #{type => warning, text => '$2'}.

line_content -> chars : '$1'.
line_content -> pipe chars : '$1'.

chars -> char chars : unicode:characters_to_binary([get_value('$1')] ++ '$2').
chars -> char : unicode:characters_to_binary([get_value('$1')]).

Erlang code.

get_value({_, _, Value}) -> Value.

If you got even this far, thank you already! If anyone could help out, even bigger thanks!

There is 1 answer

Answer by Dogbert (best answer)

I'd suggest adding a line_content rule to handle multiple lines separated by pipes, and removing the rule line -> pipe line_content : '$2'.

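You also have an unnecessary [] around '$2' in the recursive lines clause, which nests the rest of the result inside an extra list. In addition, the single-line clause should return a list. That keeps it consistent with the return value of the recursive clause, and it means you don't end up with improper lists once the extra [] is removed.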
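The effect of those two list issues can be reproduced in plain Elixir, without the parser. This is only a sketch: the maps a, b and c are made-up stand-ins for what the line rule returns, and each variable mimics what a yecc action would build for one, two or three lines.

```elixir
# Stand-ins (hypothetical) for the maps produced by the `line` rule.
a = %{type: :warning, text: "a"}
b = %{type: :warning, text: "b"}
c = %{type: :regular, text: "c"}

# Original rules:
#   lines -> line       : '$1'.
#   lines -> line lines : ['$1'|['$2']].
orig_one = c                     # a bare map, not a list
orig_two = [b | [orig_one]]      # a flat two-element list, but only by accident
orig_three = [a | [orig_two]]    # the rest of the result is nested in a list

# Removing only the extra brackets while keeping `lines -> line : '$1'`
# makes the tail of the list a map, which is an improper list.
improper = [b | c]

# Fixed rules:
#   lines -> line       : ['$1'].
#   lines -> line lines : ['$1'|'$2'].
fixed_one = [c]
fixed_two = [b | fixed_one]
fixed_three = [a | fixed_two]    # a flat, proper list of three maps

IO.inspect(orig_three, label: "original rules")
IO.inspect(improper, label: "brackets removed only")
IO.inspect(fixed_three, label: "fixed rules")
```

With two lines the original rules happen to produce a flat list, which is probably why the first test in the question looked fine; the nesting only shows up from three lines on.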

With these four changes,

-lines -> line lines : ['$1'|['$2']].
+lines -> line lines : ['$1'|'$2'].
-lines -> line : '$1'.
+lines -> line : ['$1'].

-line -> pipe line_content : '$2'.
 line -> type_regular line_content : #{type => regular, text => '$2'}.
 line -> type_warning line_content : #{type => warning, text => '$2'}.

 line_content -> chars : '$1'.
-line_content -> pipe chars : '$1'.
+line_content -> line_content pipe chars : <<'$1'/binary, '$3'/binary>>.

I can parse multiline text just fine:

Belino.parse("[!] Look at the error")
Belino.parse("[!] plugin error detected
 | check the version of the plugin")
Belino.parse("[!] a
 | warning
 [ ] a
 | regular
 [ ] another
 | regular
 [!] and another
 | warning")

Output:

[%{text: "Look at the error", type: :warning}]
[%{text: "plugin error detected  check the version of the plugin",
   type: :warning}]
[%{text: "a  warning ", type: :warning}, %{text: "a  regular ", type: :regular},
 %{text: "another  regular ", type: :regular},
 %{text: "and another  warning", type: :warning}]
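The doubled spaces in the output come from the spaces around each pipe, which the lexer emits as char tokens. The sketch below mimics, in plain Elixir, what the left-recursive line_content rule does with its segments, and shows one possible way to collapse the whitespace afterwards. The names join and normalize are hypothetical helpers, not part of the parser above, and normalizing is an optional extra step rather than something the grammar requires.

```elixir
# Mirrors the action <<'$1'/binary, '$3'/binary>>: every reduction of
# `line_content -> line_content pipe chars` appends the next segment
# to the text accumulated so far.
join = fn segments ->
  Enum.reduce(segments, fn segment, acc -> <<acc::binary, segment::binary>> end)
end

# Optional clean-up (an assumption about the desired output): split on
# runs of whitespace and join the words back with single spaces.
normalize = fn text ->
  text |> String.split() |> Enum.join(" ")
end

# Segments as the lexer would deliver them for "[!] a\n | warning\n ".
raw = join.(["a ", " warning "])

IO.inspect(raw, label: "joined")
IO.inspect(normalize.(raw), label: "normalized")
```

A step like normalize could be applied to the text field of each map after parsing, for example with Enum.map/2 over the parser result.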