I am trying to use VS Code's tokenization engine for grammar injections and I don't understand why some regular expressions fail.
For example, suppose I have the following text.
VS Code, TextMate grammars, and Oniguruma regular expressions.
Then, I want to match Oniguruma using the following regex (see demo):
(?=and\s+(Oniguruma)\s+regular)
Based on the demo above, the regular expression seems to match (or rather capture?) what I want (see below).
However, when I try this in a VS Code grammar, it fails. More specifically, the ./syntaxes/some.test.injection.json file contains:
{
  "scopeName": "some.test.injection",
  "injectionSelector": "L:text.html.markdown",
  "patterns": [
    { "include": "#test" }
  ],
  "repository": {
    "test": {
      "match": "(?=and\\s+(Oniguruma)\\s+regular)",
      "captures": {
        "1": { "name": "some.test" }
      }
    }
  }
}
Then, in package.json I have:
{
  // ...
  "contributes": {
    "grammars": [
      {
        "scopeName": "some.test.injection",
        "path": "./syntaxes/some.test.injection.json",
        "injectTo": ["text.html.markdown"]
      }
    ]
  },
  // ...
}
Finally, the token color rule in settings.json looks like this:
{
  "editor.tokenColorCustomizations": {
    "textMateRules": [
      { "scope": "some.test", "settings": { "foreground": "#dfd43b" } },
    ]
  }
}
As you can see below, the token is not parsed:
However, the token gets parsed when I use the following regex (see demo) instead:
(?<=and\s)(Oniguruma)(?=\s+regular)
As seen during the inspection of the editor token and scopes:
From the VS Code documentation (quoted below), I understand that I need to use Oniguruma regular expressions:
TextMate grammars rely on Oniguruma regular expressions and are typically written as a plist or JSON. You can find a good introduction to TextMate grammars here, and you can take a look at existing TextMate grammars to learn more about how they work.
My question is twofold:
- Why does the first expression fail? Is it not a valid Oniguruma regular expression?
- How can I test whether a regular expression is a valid Oniguruma regular expression?
VS Code uses TextMate grammars for tokenization, and TextMate grammars use the Oniguruma engine for regular expressions.
Ruby 1.9 uses the Oniguruma engine, and Ruby 2.0+ uses Onigmo, a fork of it. Rubular runs Ruby 2.5.9.
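If a local Ruby is available, the same kind of check can be sketched without Rubular. This assumes that Ruby's built-in engine accepts the same syntax as the Oniguruma build VS Code ships, which should be close but is not guaranteed. The helper name `compiles?` is made up for this example.

```ruby
# Hypothetical helper (not part of any library): returns true when
# Ruby's regex engine can compile the pattern, false otherwise.
def compiles?(pattern)
  Regexp.new(pattern)
  true
rescue RegexpError
  false
end

# Single-quoted strings keep the backslashes as written, so these are the
# patterns from the question without the extra JSON escaping.
first  = '(?=and\s+(Oniguruma)\s+regular)'
second = '(?<=and\s)(Oniguruma)(?=\s+regular)'
broken = '(?=and\s+(Oniguruma' # unbalanced parentheses, for contrast

puts compiles?(first)   # true
puts compiles?(second)  # true
puts compiles?(broken)  # false
```

Both patterns from the question compile, so the first one is syntactically valid. A successful compile only rules out syntax errors; it does not say how the tokenizer will treat the captures.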
I've been using Rubular to validate my VS Code TextMate grammars for a while, and it has never failed once.
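As for why the first expression highlights nothing even though it is valid: the whole pattern is a lookahead, so the overall match is empty and capture group 1 lies after the point where the match ends. A likely explanation, assuming the tokenizer only applies capture scopes inside the text consumed by the overall match (an assumption, not something confirmed here), is that group 1 is simply skipped. The sketch below only shows the match positions in Ruby; it does not run VS Code's tokenizer.

```ruby
text = 'VS Code, TextMate grammars, and Oniguruma regular expressions.'

# The two patterns from the question.
lookahead_only  = /(?=and\s+(Oniguruma)\s+regular)/
with_lookbehind = /(?<=and\s)(Oniguruma)(?=\s+regular)/

m1 = text.match(lookahead_only)
m2 = text.match(with_lookbehind)

[m1, m2].each do |m|
  puts "pattern:     #{m.regexp.source}"
  # Group 0 is the overall match; group 1 is the word we want to scope.
  puts "whole match: #{m[0].inspect} at #{m.begin(0)}...#{m.end(0)}"
  puts "capture 1:   #{m[1].inspect} at #{m.begin(1)}...#{m.end(1)}"
end
```

With the first pattern the whole match is the empty string at the position of "and", while capture 1 starts later in the line. With the second pattern the whole match and capture 1 cover the same characters, which is consistent with the token being scoped in that case.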