I am trying to detect non-printable characters in a string ('\n', '\r', etc.) and insert a single backslash before them. So, for example, if I have a string "Hello\nWorld", I want it to be "Hello\\nWorld". I have a code example that should do it, but it inserts a double backslash ('\\'), so the result is "Hello\\\nWorld". Is there a way to insert a single backslash in a string?
expression = Regex.Replace(expression, @"\p{Cc}", m =>
{
int code = m.Value[0];
return code < 32
? @"\" + $"{Convert.ToChar(code)}"
: Convert.ToChar(code).ToString();
});
If you don't want the long explanation, skip to the end.
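When you write a regular string literal such as "Hello\nWorld", the compiler turns the \n into a newline character, giving you a string that prints as Hello and World on two separate lines.
When you write "Hello\\nWorld", the compiler turns the \\ into a single backslash character, giving you Hello\nWorld: a backslash followed by an ordinary n, with no newline.
When you write the verbatim string @"Hello\\nWorld", the leading @ turns off compiler conversions of any slashed characters, so you get Hello\\nWorld, both backslashes included.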
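To make the three cases concrete, here is a minimal sketch that checks what each literal contains after compilation. It assumes C# 9 or later (top-level statements); the variable names are only illustrative.

```csharp
using System;

// Regular literal: the compiler converts \n into one newline character.
string regular = "Hello\nWorld";

// Regular literal: the compiler converts \\ into one backslash; the n stays an n.
string escaped = "Hello\\nWorld";

// Verbatim literal: the leading @ turns the conversion off, so both backslashes stay.
string verbatim = @"Hello\\nWorld";

Console.WriteLine(regular.Length);   // 11: 5 letters + 1 newline + 5 letters
Console.WriteLine(escaped.Length);   // 12: 5 letters + backslash + n + 5 letters
Console.WriteLine(verbatim.Length);  // 13: 5 letters + 2 backslashes + n + 5 letters
```

Counting characters is a handy way to see what is really in a string, because it sidesteps however the debugger or console chooses to display it.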
When you look at a string in the debugger tooltip or autos/locals window it shows you non-verbatim strings, i.e. it shows you the string you would have to paste into your source code to get the string you want output.
If you want to look at how the string actually would appear if you e.g. wrote it to a file and opened it in Notepad, click the magnifying glass next to the string value.
If you edit the value by writing into the tooltip or the autos window, and you write a verbatim string by preceding it with an @, remember that it will go back to being a non-verbatim string when the debugger tooltip shows it to you next.
For example, if you edit the value by typing a verbatim string that has 2 slashes, the tooltip will then show 4 slashes, because 2 real slashes double up to 4 source-code slashes. This is so that if you pasted it into code as a non-verbatim string, the compiler would convert those 4 slashes down to 2 slashes when compiling.
Hopefully you're now down with "compiler slashes". Here's the next thing to get on board with..
The regex engine is also a compiler of sorts, that also does these conversions.
When you have a regex of "a word character", which is \w, you need to get past the C# compiler conversion first. The C# compiler conversion happens at compile time, but the Regex engine conversion happens at runtime.
If you just write "\w" as a regular string literal, the compiler will try to convert that \w and choke on it, because it doesn't have a slash conversion for \w like it does for \n (newline) or \t (tab).
This means that to get the regex engine to see \w you need to write either "\\w" or @"\w". Both of these become \w once the C# compiler has processed them, so that's what the Regex engine sees.
Some slashed characters have meaning to both the compiler and the regex engine.
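As a quick check of the \w case, the sketch below builds the pattern both ways and confirms the regex engine receives the same two characters. It assumes C# 9 or later (top-level statements); the variable names are illustrative.

```csharp
using System;
using System.Text.RegularExpressions;

// Regular literal: the compiler turns \\ into one backslash, so the string is \w.
string doubled = "\\w";

// Verbatim literal: the compiler leaves it alone, so the string is also \w.
string verbatim = @"\w";

Console.WriteLine(doubled == verbatim);           // True: both are backslash + w
Console.WriteLine(doubled.Length);                // 2
Console.WriteLine(Regex.IsMatch("a", doubled));   // True: 'a' is a word character
Console.WriteLine(Regex.IsMatch("!", verbatim));  // False: '!' is not
```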
The regex engine can understand either \n (2 chars: literally a slash followed by an n) or a newline (1 char, character 10 in the ASCII table), so to get Regex to hunt for a newline you could write "\n", "\\n" or @"\n".
So bear in mind this two-step conversion. It's probably easiest to use @ strings to turn off compiler conversions, and then your slashes get through to the regex engine as you wrote them in the source. If you need to get a " through to Regex in an @ string, write "".
Also note that in recent Visual Studio, strings inside a regex get extra helpful syntax highlighting for what the regex engine sees.
Now that we have all that out of the way, and you appreciate the multiple levels of conversion going on, hopefully you can appreciate that you can't do what you're asking with Regex. There isn't any notion that a string containing Hello, a newline and World (which in source code would be either "Hello\nWorld" or an @ string with a real line break in it) could "have a slash placed in front of the newline" and pop back out as \n, because there isn't an n in the string. The string "Hello World" with some whitespace between the words doesn't contain an n at all, anywhere.
The compiler has essentially replaced the two characters slash-n with a single newline character. You cannot invert that by putting a slash in front of the newline: a string of "slash-newline" is not "slash-n".
The only reversion is to replace the newline character itself with the two characters slash and n.
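The difference between "slash-newline" and "slash-n" can be sketched as follows. This is only an illustration, assuming C# 9 or later (top-level statements); the variable names are made up for the example.

```csharp
using System;

string text = "Hello\nWorld";  // 11 chars, one of them a real newline

// Putting a backslash in front of the newline keeps the newline: slash-newline.
string slashNewline = text.Replace("\n", "\\" + "\n");

// Replacing the newline with the two characters \ and n gives slash-n.
string slashN = text.Replace("\n", "\\n");

Console.WriteLine(slashNewline.IndexOf('n') >= 0);  // False: still no letter n anywhere
Console.WriteLine(slashNewline.Length);             // 12: backslash added, newline kept
Console.WriteLine(slashN == @"Hello\nWorld");       // True: backslash followed by n
```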
There aren't slash codes for everything you'll find. About the only thing I guess you could do with your current approach would be something like:
This will take some string like:
And turn it into
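As a hedged illustration of a generic fallback in the same Regex.Replace style as the question, the sketch below writes each control character as a \uXXXX escape. The \uXXXX output format is an assumption chosen for this example, not something the question asked for, and the code assumes C# 9 or later (top-level statements).

```csharp
using System;
using System.Text.RegularExpressions;

string input = "Hello\nWorld";

// \p{Cc} matches any control character; each one is replaced by a backslash,
// a u, and the character's code as four lowercase hex digits.
string escaped = Regex.Replace(
    input,
    @"\p{Cc}",
    m => "\\u" + ((int)m.Value[0]).ToString("x4"));

Console.WriteLine(escaped);  // Hello\u000aWorld
```

This works for every control character without a lookup table, at the cost of producing \u000a rather than the friendlier \n.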
If you want it to be \n you'll have to code for it (and all the other slash-whatevers) specifically by having a big table of replacements.
Table courtesy of https://www.tutorialspoint.com/csharp/csharp_character_escapes.htm
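A minimal sketch of such a table-driven replacement might look like the following. The escapes dictionary and the Escape helper are hypothetical names introduced for this example, the table lists only the single-character C# escapes I'm sure of, and the code assumes C# 9 or later (top-level statements).

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical lookup table: each control character maps to its two-character
// source-code form (a backslash followed by a letter or digit).
var escapes = new Dictionary<char, string>
{
    ['\0'] = @"\0",
    ['\a'] = @"\a",
    ['\b'] = @"\b",
    ['\t'] = @"\t",
    ['\n'] = @"\n",
    ['\v'] = @"\v",
    ['\f'] = @"\f",
    ['\r'] = @"\r",
};

// Hypothetical helper: copies the input, swapping in the table entry where one exists.
string Escape(string s)
{
    var sb = new StringBuilder(s.Length);
    foreach (char c in s)
    {
        sb.Append(escapes.TryGetValue(c, out var replacement) ? replacement : c.ToString());
    }
    return sb.ToString();
}

Console.WriteLine(Escape("Hello\r\nWorld\t!"));  // Hello\r\nWorld\t!
```

Characters that aren't in the table pass through unchanged, so any control character without a slash code would still need separate handling.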