Symbols within a Neo4j case-insensitive regex

I'm storing e-mail addresses in some user nodes and trying to match against them, but the (?i) case-insensitive option doesn't appear to work when the address contains a +. I use such addresses for testing, for example john+business@doe.com.

Setting up test nodes:

CREATE (uWithoutSymbol:USER {
    email: '[email protected]'
})
CREATE (uWithSymbol:USER {
    email: 'john+business@doe.com'
})

The Cypher queries:

MATCH (u:USER)
// This works
WHERE u.email =~ '(?i)[email protected]'
RETURN u

MATCH (u:USER)
// This returns nothing
WHERE u.email =~ '(?i)john+business@doe.com'
RETURN u

I tried going for the case-insensitive unicode one: (?ui), but also no luck. Any ideas?

There are 2 answers

Bohemian (best answer)

The plus symbol '+' has special meaning in regex; escape it:

WHERE u.email =~ '(?i)john\\+business@doe.com'

The plus sign means "one or more of the previous term", so your unescaped attempt would match "johnbusiness@doe.com" or "johnnbusiness@doe.com", but never an address containing a literal "+".

Technically, you should probably escape the dot too:

WHERE u.email =~ '(?i)john\\+business@doe\\.com'

because without escaping the dot, it will match any character there, eg it will match "john+business@doeAcom" or "john+business@doe#com" too.
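
The difference can be checked outside Neo4j. Below is a minimal sketch in Python, assuming its `re` module treats `(?i)`, `+` and `.` the same way as the Java regex engine behind Cypher's `=~` for patterns this simple. `re.fullmatch` is used because `=~` has to match the whole string. The helper name `cypher_like_match` is hypothetical and only serves the illustration.

```python
import re


def cypher_like_match(pattern: str, value: str) -> bool:
    """Approximate Cypher's `value =~ pattern`: the whole string must match."""
    return re.fullmatch(pattern, value) is not None


email = "john+business@doe.com"

# Unescaped: '+' quantifies the preceding 'n', so a literal '+' never matches.
unescaped = r"(?i)john+business@doe.com"
print(cypher_like_match(unescaped, email))                    # False
print(cypher_like_match(unescaped, "johnnbusiness@doe.com"))  # True

# Escaped '+' and '.': only the literal address matches, in any letter case.
escaped = r"(?i)john\+business@doe\.com"
print(cypher_like_match(escaped, email))                      # True
print(cypher_like_match(escaped, "JOHN+Business@Doe.com"))    # True
print(cypher_like_match(escaped, "john+business@doeAcom"))    # False
```

The raw strings here hold the regex itself, with single backslashes. Inside a Cypher string literal each of those backslashes has to be written twice.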

Thanks to @Stefan for pointing out that a double backslash is needed in the Cypher string to produce a single literal backslash in the regex.

Stefan Armbruster

@Bohemian's answer addresses the issue: you need to escape the +. But inside a Cypher string literal the backslash itself has to be doubled:

MATCH (u:USER)
WHERE u.email =~ '(?i)john\\+business@doe.com'
RETURN u

returns the desired result.
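
When the address comes from application code rather than being typed by hand, the escaping can be generated instead. The following is a rough sketch, assuming Python 3.7 or later on the client side and assuming that the output of `re.escape` (a backslash in front of each regex metacharacter) is also acceptable to the Java regex engine Neo4j uses. Both function names are hypothetical helpers, not part of any Neo4j API.

```python
import re


def case_insensitive_pattern(literal: str) -> str:
    """Build a regex that matches `literal` exactly, ignoring letter case."""
    return "(?i)" + re.escape(literal)


def as_cypher_string_literal(regex: str) -> str:
    """Double each backslash, escape single quotes, and wrap in quotes,
    so the regex can be pasted into Cypher query text."""
    return "'" + regex.replace("\\", "\\\\").replace("'", "\\'") + "'"


pattern = case_insensitive_pattern("john+business@doe.com")
print(pattern)                            # (?i)john\+business@doe\.com
print(as_cypher_string_literal(pattern))  # '(?i)john\\+business@doe\\.com'
```

If the pattern is passed as a query parameter (for example `WHERE u.email =~ $pattern`) instead of being pasted into the query text, the first form with single backslashes should be the one to send, since parameter values are not parsed as string literals.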