I have a readable stream of text input (including Unicode characters from HTML) from which I am trying to extract information by specifying the structure in PEG.js and returning custom JSON objects from the matched items.
The text input has the following format:
1. some input [tags]
(a) some text (b) some text
Ans. (b)
2. some input [tags]
(a) some text (b) some text
Ans. (b)
After searching for available Node.js parsers, I found PEG.js and tried this sample grammar in its online version:
start
  = demo
_ "whitespace"
  = [ \t\n\r]*
demo
  = digits:[0-9]+."whitespace" "literal"+
Integer "integer"
  = _ [0-9]+ { return parseInt(text(), 10); }
But I am getting this error:
"Line 1, column 3: Expected "whitespace" but " " found."
So, how can I include whitespace in my expression? Or are there better ways or libraries to accomplish this with Node.js?
You are using "whitespace", but you should be using _. You can think of "whitespace" as a comment that explains what _ is supposed to mean.
PEG.js should work fine in your case, as long as the input data is well formed.
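To make this concrete, here is a rough sketch of a grammar for the sample input in the question. It is only one possible approach, not a tested solution for your real data. The rule names (question, option, integer) and the field names (number, body, tags, choices, answer) are made up for illustration. It assumes every entry has a [tags] block on its first line, that options are labelled with a single lowercase letter, and that every entry ends with an Ans. line. Two things to note: inside an expression a quoted string is matched literally, and a bare . matches any single character, so the dot after the number is written as ".".

```pegjs
// Hypothetical grammar for the sample input; all rule and field names are illustrative.
start
  = _ items:question* { return items; }

// "1. some input [tags]" followed by the options and the "Ans. (b)" line.
question
  = number:integer "." _ body:$[^\[\n]+ "[" tags:$[^\]]* "]" _
    choices:option+
    "Ans." _ "(" answer:[a-z] ")" _
    { return { number: number, body: body.trim(), tags: tags, choices: choices, answer: answer }; }

// "(a) some text": the text runs until the next "(x)" label or the end of the line.
option
  = "(" label:[a-z] ")" _ content:$(!("(" [a-z] ")") [^\n])+ _
    { return { label: label, text: content.trim() }; }

integer "integer"
  = [0-9]+ { return parseInt(text(), 10); }

// The quoted name is only used in error messages; use _ to actually match whitespace.
_ "whitespace"
  = [ \t\n\r]*
```

Pasted into the online editor with the sample input, this should return an array with one object per question. If the real data deviates from the assumptions above (for example, an option text that itself contains something like "(a)"), the option rule will need adjusting.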