Regexp gives an extra matching group


I have content that is text mixed with JSON:

blablabla  bla bla 
sdf
sdfsdfsdf {
    "glossary": [{
        "title": "example glossary",
        "GlossDiv": {
            "title": "S",
            "GlossList": {
                "GlossEntry": {
                    "ID": "SGML",
                    "SortAs": "SGML",
                    "GlossTerm": "Standard Generalized Markup Language",
                    "Acronym": "SGML",
                    "Abbrev": "ISO 8879:1986",
                    "GlossDef": {
                        "para": "A meta-markup language, used to create markup languages such as DocBook.",
                        "GlossSeeAlso": ["GML", "XML"]
                    },
                    "GlossSee": "markup"
                }
            }
        }
    },
    {
        "val":2
    }]
} dd dfsdfsdf
bla blablablabla

I want to extract the JSON from the string, so I use this regexp:

\{(.|\s)+\}

It gives me (checked it on https://regex101.com/):

  • A full match containing my correctly found JSON
  • An empty group

I don't understand what causes the empty group to appear.

1 Answer

Egan Wolf

The group is not actually empty: it holds the last newline character, which was matched by \s. Regex101 even shows a warning that when a capturing group is repeated, as in (.)+, only the last iteration is captured. To get rid of the group, use a non-capturing group: \{(?:.|\s)+\}. To capture everything between the outer braces in a single group instead, keep the non-capturing group and wrap it, together with its quantifier, in a capturing group: \{((?:.|\s)+)\}.
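For illustration, here is a minimal sketch of the three patterns in Python's re module. The sample string is a shortened stand-in for the text in the question, and the variable names are made up for this example. Regex101's default PCRE flavor should behave the same way for these patterns, but that is an assumption on my part.

```python
import json
import re

# Hypothetical, shortened stand-in for the text in the question.
text = 'intro {\n  "a": [1, 2]\n} trailing'

# Repeated capturing group: only the last iteration is kept.
# The last character consumed before the closing brace is a newline.
repeated = re.search(r'\{(.|\s)+\}', text)
print(repr(repeated.group(0)))  # the whole JSON object
print(repr(repeated.group(1)))  # '\n', which looks empty in a UI

# Non-capturing group: the full match is the same, and no group is reported.
plain = re.search(r'\{(?:.|\s)+\}', text)
print(plain.groups())  # ()

# Non-capturing group wrapped in one capturing group:
# group 1 is everything between the outer braces.
wrapped = re.search(r'\{((?:.|\s)+)\}', text)
print(repr(wrapped.group(1)))

# The full match can be handed to a JSON parser.
parsed = json.loads(plain.group(0))
print(parsed)
```

Printing with repr() makes the captured newline visible, which is the part that shows up as an "empty" group on Regex101.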

Actually, don't do this. Please refer to this comment and comments below.