How can I test if an expression is valid for TextMate grammars in VS Code?


I am trying to use VS Code's tokenization engine for grammar injections and I don't understand why some regular expressions fail.

For example, suppose I have the following text.

VS Code, TextMate grammars, and Oniguruma regular expressions. 

Then, I want to match Oniguruma using the following regex (see demo):

(?=and\s+(Oniguruma)\s+regular)

Based on the demo above, the regular expression seems to capture what I want (see below).

demo matching

However, when trying this in the context of VS Code grammars, it fails. More specifically, the ./syntaxes/some.test.injection.json file contains:

{
    "scopeName": "some.test.injection",
    "injectionSelector": "L:text.html.markdown",
    "patterns": [
        { "include": "#test" }
    ],
    "repository": {
        "test": {
            "match": "(?=and\\s+(Oniguruma)\\s+regular)",
            "captures": {
                "1": { "name" : "some.test" }
            }
        }
    }
}

Then, in package.json I have:

{
    // ...
    "contributes": {
        "grammars": [
            {
                "scopeName": "some.test.injection",
                "path": "./syntaxes/some.test.injection.json",
                "injectTo": ["text.html.markdown"]
            }
        ]
    },
    // ...
}

Finally, the token color rule in settings.json looks like this:

{
    "editor.tokenColorCustomizations": {
        "textMateRules": [
            { "scope": "some.test", "settings": { "foreground": "#dfd43b" } }
        ]
    }
}

As you can see below, the token is not parsed:

enter image description here

However, the token gets parsed when I use the following regex (see demo) instead:

(?<=and\s)(Oniguruma)(?=\s+regular)

As seen during the inspection of the editor token and scopes:

enter image description here
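
To make the difference between the two expressions concrete, here is a minimal sketch that prints what each one actually matches. It uses Python's re module as a stand-in, which is an assumption: re is not Oniguruma, but lookahead, lookbehind, and capture groups behave the same way for these two patterns. The variable names are only illustrative.

```python
import re

text = "VS Code, TextMate grammars, and Oniguruma regular expressions."

# Pattern 1: the whole expression is a lookahead, so the overall match is empty.
lookahead_only = re.compile(r"(?=and\s+(Oniguruma)\s+regular)")
# Pattern 2: lookarounds only surround the capture, so the match consumes the word.
consuming = re.compile(r"(?<=and\s)(Oniguruma)(?=\s+regular)")

m1 = lookahead_only.search(text)
m2 = consuming.search(text)

# Show the overall match, its span, and the span of capture group 1.
print(repr(m1.group(0)), m1.span(0), m1.span(1))
print(repr(m2.group(0)), m2.span(0), m2.span(1))
```

With the first pattern the overall match is zero-width and group 1 lies entirely after it, while with the second pattern group 1 coincides with the overall match. A possible explanation for the behavior in VS Code, which I have not confirmed, is that the tokenizer only applies capture scopes to text inside the overall match.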

From the VS Code documentation (see below) I understand that I need to use Oniguruma regular expressions:

TextMate grammars rely on Oniguruma regular expressions and are typically written as a plist or JSON. You can find a good introduction to TextMate grammars here, and you can take a look at existing TextMate grammars to learn more about how they work.

My question is twofold:

  1. Why does the first expression fail? Is it not a valid Oniguruma regular expression?
  2. How can I test whether a regular expression is a valid Oniguruma regular expression?

There are 2 answers

ghaschel

VS Code tokenizes with TextMate grammars, and TextMate grammars use the Oniguruma regex engine.

Ruby 1.9 uses the Oniguruma engine (Ruby 2.0+ uses Onigmo, a fork of it), and Rubular runs Ruby 2.5.9.

I've been using Rubular to validate my VS Code TextMate grammars for a while, and it has never failed me once.
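
One practical detail when pasting into a tester such as Rubular: the patterns in the grammar file are JSON strings, so \\s in the file is really \s in the regex. Here is a rough sketch in Python that decodes the grammar from the question and prints the raw patterns ready to paste. The helper name collect_patterns is hypothetical, and I'm assuming only the "match", "begin", and "end" keys need to be collected.

```python
import json

# Grammar text copied from the question; a raw string keeps the JSON escapes intact.
GRAMMAR_JSON = r'''
{
    "scopeName": "some.test.injection",
    "injectionSelector": "L:text.html.markdown",
    "patterns": [{ "include": "#test" }],
    "repository": {
        "test": {
            "match": "(?=and\\s+(Oniguruma)\\s+regular)",
            "captures": { "1": { "name": "some.test" } }
        }
    }
}
'''

def collect_patterns(node):
    """Recursively gather every "match", "begin", and "end" regex string (hypothetical helper)."""
    found = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key in ("match", "begin", "end") and isinstance(value, str):
                found.append(value)
            else:
                found.extend(collect_patterns(value))
    elif isinstance(node, list):
        for item in node:
            found.extend(collect_patterns(item))
    return found

# json.loads undoes the JSON escaping, so each pattern has single backslashes.
patterns = collect_patterns(json.loads(GRAMMAR_JSON))
for pattern in patterns:
    print(pattern)
```

This only undoes the escaping; it does not check the pattern against Oniguruma itself, so the printed regex still has to go through Rubular or a similar tool.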

William Cole

I'm pretty sure your regex is being overridden by another one that is also present in the .tmLanguage.json file.
To check this, do the following:
In the file where you write the regex (assuming it also contains other patterns), search (Ctrl + F) for the TextMate scope "text.html.markdown" (as shown in your screenshot). Then replace the regex registered for that scope with yours, change its name to "some.test", and reload VS Code.