Regex to replace multiple spaces at the end of line with comma or add comma


I'm using regular expressions in TextPad to clean up and prepare SQL scripts. I want each line to end with a comma, which means:

  • replacing 1 or more spaces at the end of a line with a comma
  • adding a comma at the end of a line that has no trailing spaces, including the last line (i.e. "end of file")

After a few hours of researching and iterating, I came up with the following, which gets close to the result I want but is not quite right. How do I edit it to get the desired result shown below?

Find what: ' +$|(\n)|$(?![\r\n])'

Replace with: '\1\2,'

I have data that looks like

       dog  *(2 spaces)*
        cat    *(4 spaces)*
        bird*(no space)*
       rodent *(1 space)*
      fish*(no space)*

I want the result to be

    dog,
    cat,
    bird,
    rodent,
    fish,

My result is

        dog,
         cat,
         bird
    ,     rodent,
         fish,

There are 2 answers

2
Barmar On BEST ANSWER

I think you're overcomplicating it. Just match any number of spaces at the end of the line, and replace with comma.

Find: \s*$

Replace with: ,

0
bobble bubble On

In TextPad or Notepad++, \s*$ will work, but it's worth mentioning that in other environments it can lead to undesired matches (regex101) and add an extra comma when a line has trailing spaces. The reason is that on a line such as cat followed by spaces, the pattern first matches the spaces after cat (first match) and then produces a zero-length match at the end of the line (second match).
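To make the double match concrete, here is a minimal Python sketch. It assumes Python 3.7 or later, where re.sub also replaces an empty match that directly follows a non-empty one. The function names are made up for illustration and are not part of any answer here.

```python
import re

def add_comma_naive(line: str) -> str:
    # Replace every match of \s*$ with a comma (hypothetical helper).
    # On a line with trailing spaces this matches twice: once for the
    # spaces, once for the empty string at the very end.
    return re.sub(r"\s*$", ",", line)

def add_comma_once(line: str) -> str:
    # Same pattern, but stop after the first replacement.
    return re.sub(r"\s*$", ",", line, count=1)

print(repr(add_comma_naive("bird")))     # no trailing spaces: one match
print(repr(add_comma_naive("cat    ")))  # trailing spaces: two matches
print(repr(add_comma_once("cat    ")))   # limited to one replacement
```

Limiting the replacement count is only one possible workaround and only helps when the input is handled one line at a time.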

Another potential issue of \s*$ can be read here: The RegEx that killed StackOverflow (blog)

If there are many spaces inside the text it can lead to a lot of backtracking (regex101 demo). This demo input needs about 7k steps just to remove some spaces at the end. A workaround to reduce steps can be to consume and capture the part up to the last non-white-space (if there is one).

^(.*\S)?\s*$

Replace with $1, (regex101 demo), where $1 is whatever the first group captured. The group is optional so that the pattern also matches lines that contain only whitespace. This gets it down to a bit more than 100 steps.
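Outside an editor, the same pattern could be applied line by line. The following Python sketch is one way to do that under stated assumptions: the helper name add_trailing_commas is hypothetical, and the text is split into lines first so that \s can never consume a line break.

```python
import re

# Capture everything up to the last non-whitespace character (if any),
# then match whatever whitespace is left before the end of the line.
TRAILING = re.compile(r"^(.*\S)?\s*$")

def add_trailing_commas(text: str) -> str:
    # Hypothetical helper: process each line separately and rejoin with "\n".
    fixed = []
    for line in text.splitlines():
        # group(1) is None for blank or whitespace-only lines, so fall back to "".
        fixed.append(TRAILING.sub(lambda m: (m.group(1) or "") + ",", line))
    return "\n".join(fixed)

sample = "  dog  \n  cat    \n  bird\n  rodent \n  fish"
print(add_trailing_commas(sample))
```

Because the pattern is anchored with ^ and no multiline flag is set, each line yields exactly one match, so the extra comma from the zero-length match does not appear. Leading indentation is kept, since it is part of the captured group.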