Parsing XML-type file with LPeg re module

Question

310 views Asked by Augusto T. At 29 May 2015 at 20:24

I'm trying to learn LPeg's re module and it has been quite an interesting experience, especially since the official documentation is so nice.

However, some topics seem to be poorly explained there, for example the named group capture construction: {:name: p :}.

Consider the following example; I don't understand why it does not match:

print(re.compile
  [[item <- ('<' {:tag: %w+!%w :} '>' item+ '</' =tag '>') / %w+!%w]]
  :match[[<person><name>James</name><address>Earth</address></person>]])

-- outputs nil

Can anyone help me understand what is going wrong here? I have thought about it quite a bit, and it really seems like I'm missing something important.

There is 1 answer

**wqw** · Accepted Answer · 2015-12-26T22:08:55+00:00

This is a late answer, but you can try the following pattern:

result = re.compile[[
  item <- ({| %s* '<' {:tag: %w+ :} %s* '>' (item / %s* { (!(%s* '<') .)+ }) %s* '</' =tag '>' |})+
]]:match[[
<person>
    <name>
    James
    </name>
    <address>Earth</address>
</person>
]]

This uses table captures to parse the XML, stripping the whitespace around each element's text, and produces:

tag = "person"
[1] = {
  tag = "name"
  [1] = "James"
}
[2] = {
  tag = "address"
  [1] = "Earth"
}
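
As for why the original pattern returns nil, a likely explanation is the scoping of named groups: =tag refers to the most recent complete group called tag that is not hidden inside another completed capture. Without a surrounding capture, the tag groups of the already closed child elements stay visible, so at </person> the back reference compares against "address" rather than "person". Wrapping each element in a table capture {| |} hides the children's groups once they are complete. The sketch below illustrates this on the single-line input from the question; it assumes an LPeg version whose re module supports {| |}, and the names xml, flat and scoped are only illustrative.

```lua
-- Assumes LPeg is installed so that require("re") finds the re module.
local re = require("re")

local xml = "<person><name>James</name><address>Earth</address></person>"

-- Pattern from the question: the tag groups of closed children remain visible,
-- so the =tag before </person> sees "address" and the match fails.
local flat = re.compile[[
  item <- ('<' {:tag: %w+!%w :} '>' item+ '</' =tag '>') / %w+!%w
]]

-- Same grammar, but each element is wrapped in a table capture, which hides
-- its tag group from the parent once the element is complete.
local scoped = re.compile[[
  item <- {| '<' {:tag: %w+ :} '>' item+ '</' =tag '>' |} / { %w+ }
]]

print(flat:match(xml))  -- nil

local doc = scoped:match(xml)
print(doc.tag, doc[1].tag, doc[1][1], doc[2].tag, doc[2][1])
-- person  name  James  address  Earth
```

Unlike the pattern above, this sketch does not skip whitespace, so it only handles input without spaces or newlines between tags.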
