I need to replace a very big list of predefined patterns. These patterns contain only [a-zA-Z] characters; the underscore is excluded. They may appear in different forms: as a whole word, or as a word preceded and/or followed by an underscore character '_'.
Example: to replace FOO with BAR, I use the following four instructions:
$ cat > /tmp/try.pl
s/\bFOO\b/BAR/g;s/\bFOO_/BAR_/g;s/_FOO\b/_BAR/g;s/_FOO_/_BAR_/g;
$ perl -p /tmp/try.pl
FOO aaa_FOO FOO_bbb FOO.txt a-FOO-b.txt aaa_FOO_bbb dontchange_FOOQUX_dontchange
BAR aaa_BAR BAR_bbb BAR.txt a-BAR-b.txt aaa_BAR_bbb dontchange_FOOQUX_dontchange
It does exactly what I want, but with thousands of words it takes time. If I could exclude the underscore from the word character class, I think I could use only one instruction:
s/\bFOO\b/BAR/g.
So is there any way to modify Perl's word character class or the \b boundary definition to exclude the underscore character?
You can combine \b and _ in a capture group, (\b|_), and merge the four regexes into a single substitution.

This keeps the behavior of your original substitutions but, as ikegami points out in the comments, it fails for input such as _FOO_FOO_, because the underscore consumed by the first match is no longer available to start the second. We can fix that using lookaround assertions. Lookarounds are non-destructive towards the border characters, so they can match two replacements separated by a single border character, as in _FOO_FOO_.
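
The two approaches described above can be sketched as follows. This is a reconstruction under assumptions, not necessarily the exact code from the answer: the sub names replace_with_captures and replace_with_lookarounds are hypothetical, and the lookbehind is written as (?:\b|(?<=_)) rather than (?<=\b|_) on the assumption that older Perl versions reject alternatives of different lengths inside a lookbehind.

```perl
use strict;
use warnings;

# Capture-group version: the border characters are consumed by the match
# and written back through $1 and $2.
sub replace_with_captures {
    my ($text) = @_;
    $text =~ s/(\b|_)FOO(\b|_)/$1BAR$2/g;
    return $text;
}

# Lookaround version: the borders are only asserted, never consumed,
# so a single underscore can end one match and begin the next.
sub replace_with_lookarounds {
    my ($text) = @_;
    $text =~ s/(?:\b|(?<=_))FOO(?:\b|(?=_))/BAR/g;
    return $text;
}

print replace_with_captures('_FOO_FOO_'), "\n";     # prints _BAR_FOO_
print replace_with_lookarounds('_FOO_FOO_'), "\n";  # prints _BAR_BAR_
```

On the sample line from the question, both versions should give the same output as the four original substitutions; they differ only when two occurrences share a single border character.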