Regex Lookahead/behind to find character unless followed by the same


I'm really not good with Regex and have been messing about to achieve the following all morning:

I want to find Unicode escape sequences, e.g. "\00026", in an SQL string before saving it to the database and escape the "\" by replacing it with "\\", unless the sequence already starts with two "\" characters.

\\(?=[0])(?<![\\])

Is what I have written, which as I understand it does:

find the "\" character, positive look ahead for a "0", and look behind to check it isn't preceded by a "\"

But it's not working, so clearly I have misunderstood!

I can shorten it to \\(?=[0])

But then I get the "\" before the 0, even if it is preceded by another "\"

So how do I do:

Replace("\00026", "regex", "\\") to get "\\00026"
AND ensure that 
Replace("\\00026", "regex", "\\") also gives "\\00026"

All help much appreciated!

EDIT:

To be clear, this must process the entire string and replace all occurrences, not just the first. Also, I am using VB.NET, if that makes a difference.

There are 2 answers

Answer by Wiktor Stribiżew (best answer)

Let me explain why your regex does not work.

  • \\ - Matches \
  • (?=[0]) - Checks (not matches) if the next character is 0
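  • (?<![\\]) - Checks (but does not match) that the preceding character is not \. At this point in the pattern, the preceding character is the \ that was just matched.

The last condition therefore always fails the match. To test the character before the backslash, the lookbehind has to come before \\ in the pattern.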
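As a minimal sketch of the point above, here is the same check in Python. The question targets VB.NET, so the language and the sample strings are assumptions. For these simple lookarounds the two regex engines should behave alike. The reordered pattern is one possible fix, not necessarily the one the asker ends up using.

```python
import re

# The pattern from the question: the lookbehind sits AFTER the backslash,
# so the character it inspects is the backslash that was just matched.
original = re.compile(r"\\(?=[0])(?<![\\])")

# Reordered (an assumption about the intended fix): the lookbehind now runs
# before the backslash is consumed.
reordered = re.compile(r"(?<!\\)\\(?=0)")

single = r"\00026"    # one backslash
double = r"\\00026"   # two backslashes, i.e. already escaped

print(original.search(single))    # no match: the lookbehind can never succeed
print(reordered.search(single))   # matches the lone backslash at index 0
print(reordered.search(double))   # no match: the input is already escaped
```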

If you want to match / in standalone /000xx strings (e.g. separated by spaces), where x is any digit, you can use

\B(?<!/)/(?!/)(?=000\d{2})


To match the string even in a context like w/00023, you can remove \B:

(?<!/)/(?!/)(?=000\d{2})

If you do not care about 0s, but just any digits:

(?<!/)/(?!/)(?=\d)

And in case you have \ (not /), just replace / with \\ in the above regular expressions.
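The substitution described above can be sketched as follows, again in Python rather than VB.NET (an assumption made so the snippet is easy to run). The three patterns are the ones from this answer with / swapped for \\. The sample inputs are hypothetical.

```python
import re

# The three patterns from the answer, with / replaced by an escaped backslash.
standalone = re.compile(r"\B(?<!\\)\\(?!\\)(?=000\d{2})")  # not glued to a word character
in_context = re.compile(r"(?<!\\)\\(?!\\)(?=000\d{2})")    # also matches inside w\00023
any_digit = re.compile(r"(?<!\\)\\(?!\\)(?=\d)")           # any digit may follow

# Hypothetical inputs: plain, glued to a letter, non-zero digits, already escaped.
samples = [r"\00026", r"w\00023", r"\12345", r"\\00026"]

# For each sample, record which of the three patterns finds a match.
results = {
    text: [bool(p.search(text)) for p in (standalone, in_context, any_digit)]
    for text in samples
}

for text, hits in results.items():
    print(text, hits)
```

The already-escaped input is rejected by all three patterns: the first backslash fails (?!\\) and the second fails (?<!\\).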

Answer by karthik manchala

You can use the following regex:

(?<!/)/(?=0)

And replace with //
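A minimal sketch of this replacement applied to a whole string, using the backslash form of the pattern. It is written in Python as an assumption; the function name and the sample text are hypothetical.

```python
import re

# Backslash version of the pattern above: a backslash that is not preceded
# by another backslash and is followed by 0.
pattern = re.compile(r"(?<!\\)\\(?=0)")

def escape_lone_backslashes(sql: str) -> str:
    """Double every lone backslash that is followed by 0."""
    # A function replacement sidesteps Python's own escape handling in
    # replacement strings; it returns two literal backslashes.
    return pattern.sub(lambda m: "\\\\", sql)

text = r"a \00026 b \\00041 c \00027"   # hypothetical input with mixed cases
print(escape_lone_backslashes(text))
```

Python's sub replaces every occurrence, and running the function a second time changes nothing, because the doubled backslashes no longer match. In .NET, Regex.Replace(input, pattern, replacement) also replaces all matches, but its replacement-string escaping rules differ from Python's, so check them before porting.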
