TL;DR: This is a bug that will be fixed in textX 3.0. Until then, the workaround is to use a regex match instead of a string match for escaped (\) characters such as \n, i.e. /\n/ rather than "\n".
FULL QUESTION: Using textX, I am parsing a homegrown markup language in which paragraph and line breaks are significant. I think I am missing something fundamental about matching newlines: why do the string matches "\n" and "\n\n" fail, while their regex counterparts /\n/ and /\n\n/ work?
NOTE: whitespace is redefined at parser level to exclude \n using ws=" \t".
import textx as tx
grammar = r"""
Root:
    content*=Content
;
Content:
    Text | ParagraphBreak | LineBreak
;
ParagraphBreak:
    paragraphbreak="\n\n"
    // paragraphbreak=/\n\n/
;
LineBreak:
    linebreak="\n"  // Will cause parsing error
    // linebreak=/\n/  // Will parse fine
;
Text[noskipws]:  // All text valid
    text=/[^\n]*/
;
"""
parser = tx.metamodel_from_str(grammar, ws=" \t")
source = "Line.\nBreak.\n\n"
parsed_source = parser.model_from_str(source)
print(parsed_source.content)
When running the above code on my system, using
- Python 3.10.1
- Poetry 1.1.12, with these entries in poetry.lock:
  - [[package]] name = "arpeggio", version = "1.10.2", ..., python-versions = "*"
  - [[package]] name = "textx", version = "2.3.0", ..., python-versions = "*", [package.dependencies] Arpeggio = ">=1.9.0"
 
I get the following result:
With root of paths: /Users/[redacted]/Library/Caches/pypoetry/virtualenvs.
  File ".../[redacted]-py3.10/lib/python3.10/site-packages/textx/model.py", line 291, in _parse
    return self.parser_model.parse(self)
  File ".../[redacted]-py3.10/lib/python3.10/site-packages/arpeggio/__init__.py", line 291, in parse
    result = self._parse(parser)
  File ".../[redacted]-py3.10/lib/python3.10/site-packages/arpeggio/__init__.py", line 370, in _parse
    result = e.parse(parser)
  File ".../[redacted]-py3.10/lib/python3.10/site-packages/arpeggio/__init__.py", line 789, in parse
    result = self._parse(parser)
  File ".../[redacted]-py3.10/lib/python3.10/site-packages/arpeggio/__init__.py", line 945, in _parse
    parser._nm_raise(self, c_pos, parser)
  File ".../[redacted]-py3.10/lib/python3.10/site-packages/arpeggio/__init__.py", line 1718, in _nm_raise
    raise self.nm
  File ".../[redacted]-py3.10/lib/python3.10/site-packages/arpeggio/__init__.py", line 485, in _parse
    result = p(parser)
  File ".../[redacted]-py3.10/lib/python3.10/site-packages/arpeggio/__init__.py", line 291, in parse
    result = self._parse(parser)
  File ".../[redacted]-py3.10/lib/python3.10/site-packages/arpeggio/__init__.py", line 423, in _parse
    parser._nm_raise(self, c_pos, parser)
  File ".../[redacted]-py3.10/lib/python3.10/site-packages/arpeggio/__init__.py", line 1718, in _nm_raise
    raise self.nm
  File ".../[redacted]-py3.10/lib/python3.10/site-packages/arpeggio/__init__.py", line 409, in _parse
    result = e.parse(parser)
  File ".../[redacted]-py3.10/lib/python3.10/site-packages/arpeggio/__init__.py", line 291, in parse
    result = self._parse(parser)
  File ".../[redacted]-py3.10/lib/python3.10/site-packages/arpeggio/__init__.py", line 370, in _parse
    result = e.parse(parser)
  File ".../[redacted]-py3.10/lib/python3.10/site-packages/arpeggio/__init__.py", line 291, in parse
    result = self._parse(parser)
  File ".../[redacted]-py3.10/lib/python3.10/site-packages/arpeggio/__init__.py", line 370, in _parse
    result = e.parse(parser)
  File ".../[redacted]-py3.10/lib/python3.10/site-packages/arpeggio/__init__.py", line 789, in parse
    result = self._parse(parser)
  File ".../[redacted]-py3.10/lib/python3.10/site-packages/arpeggio/__init__.py", line 898, in _parse
    parser._nm_raise(self, c_pos, parser)
  File ".../[redacted]-py3.10/lib/python3.10/site-packages/arpeggio/__init__.py", line 1718, in _nm_raise
    raise self.nm
  File ".../[redacted]-py3.10/lib/python3.10/site-packages/arpeggio/__init__.py", line 409, in _parse
    result = e.parse(parser)
  File ".../[redacted]-py3.10/lib/python3.10/site-packages/arpeggio/__init__.py", line 291, in parse
    result = self._parse(parser)
  File ".../[redacted]-py3.10/lib/python3.10/site-packages/arpeggio/__init__.py", line 370, in _parse
    result = e.parse(parser)
  File ".../[redacted]-py3.10/lib/python3.10/site-packages/arpeggio/__init__.py", line 291, in parse
    result = self._parse(parser)
  File ".../[redacted]-py3.10/lib/python3.10/site-packages/arpeggio/__init__.py", line 370, in _parse
    result = e.parse(parser)
  File ".../[redacted]-py3.10/lib/python3.10/site-packages/arpeggio/__init__.py", line 789, in parse
    result = self._parse(parser)
  File ".../[redacted]-py3.10/lib/python3.10/site-packages/arpeggio/__init__.py", line 898, in _parse
    parser._nm_raise(self, c_pos, parser)
  File ".../[redacted]-py3.10/lib/python3.10/site-packages/arpeggio/__init__.py", line 1718, in _nm_raise
    raise self.nm
arpeggio.NoMatch: Expected '\n\n' or '\n' or EOF at position (1, 6) => 'Line.* Break.  '.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/Users/[redacted]/scratchpad/TextX/linebreaks.py", line 31, in <module>
    parsed_source = parser.model_from_str(source)
  File ".../[redacted]-py3.10/lib/python3.10/site-packages/textx/metamodel.py", line 615, in model_from_str
    model = self._parser_blueprint.clone().get_model_from_str(
  File ".../[redacted]-py3.10/lib/python3.10/site-packages/textx/model.py", line 332, in get_model_from_str
    self.parse(model_str, file_name=file_name)
  File ".../[redacted]-py3.10/lib/python3.10/site-packages/arpeggio/__init__.py", line 1516, in parse
    self.parse_tree = self._parse()
  File ".../[redacted]-py3.10/lib/python3.10/site-packages/textx/model.py", line 294, in _parse
    raise TextXSyntaxError(message=text(e),
textx.exceptions.TextXSyntaxError: None:1:6: error: Expected '\n\n' or '\n' or EOF at position (1, 6) => 'Line.* Break.  '.
I was expecting the same result as the regex version, which is:
[<textx:Text instance at 0x10129bc40>, <textx:LineBreak instance at 0x101298040>, <textx:Text instance at 0x101298130>, <textx:ParagraphBreak instance at 0x10129aec0>]
 
                        
This is a known problem that has been addressed in the current development version. Please see the corresponding textX issue.
The fix will be part of the upcoming textX 3.0 release.
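To illustrate why the regex form can behave differently from the quoted form, here is a small sketch that uses only Python's re module, not textX itself. It rests on an assumption that the answer above does not confirm: that a string match in textX 2.x compares the text between the quotes literally, without interpreting escape sequences, whereas a regex match hands the pattern to re, which does interpret \n. The variable names are made up for the illustration.

```python
import re

# Inside a raw Python string (r"""...""") the grammar text keeps the
# backslash, so the grammar contains two characters: '\' followed by 'n'.
grammar_fragment = r"\n"
source = "Line.\nBreak.\n\n"
newline_pos = source.index("\n")  # index of the first real newline

# Literal comparison (ASSUMED behaviour of a string match in textX 2.x):
# backslash + 'n' is not a newline character, so nothing matches here.
literal_match = source.startswith(grammar_fragment, newline_pos)

# Regex match: the re module itself interprets the two-character escape \n
# as a newline, so the very same grammar text matches.
regex_match = re.compile(grammar_fragment).match(source, newline_pos)

print(len(grammar_fragment))    # 2
print(literal_match)            # False
print(regex_match is not None)  # True
```

If that assumption holds, it would also fit the error message in the question, which reports the failure at column 6 of line 1, exactly where the first newline sits in the source string.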