How to match a whole word containing special characters?


I need to match words using a single pattern. A word should match if it meets either of the following criteria:

  • its first character is a digit or an underscore, OR

  • it contains at least one special character (excluding underscore) anywhere in the word.

Should match

3testData
3test_Data
_testData
_test3Data
%data%
test%BIN%data
te$t&$#@daTa

Should NOT match

test_Data3

So far, I have managed to match some of them through:

[\p{^Alpha}]\S+

This works except for the words where the special characters are inside the word:

3testData
3test_Data
_testData
_test3Data
%data%
test%BIN%data
te$t&$#@daTa


There are 2 answers

2
The fourth bird On BEST ANSWER

If lookbehinds are supported, you could use an alternation. The first alternative matches a word that starts with a digit or an underscore. The second matches zero or more non-whitespace characters, then at least one special character from a character class, then zero or more non-whitespace characters again.

(?<=\s|^)(?:[\d_]\S+|\S*[%@#$]\S*)(?=\s|$)


Explanation

  • (?<=\s|^) Positive lookbehind to assert what is on the left is either a whitespace character or the start of the string
  • (?: Start non capturing group
    • [\d_]\S+ Match a digit or an underscore followed by one or more non-whitespace characters
    • | Or
    • \S*[%@#$]\S* Match zero or more non-whitespace characters, then one character from the character class, then zero or more non-whitespace characters again
  • ) Close non capturing group
  • (?=\s|$) Positive lookahead to assert that what follows is a whitespace character or the end of the string
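
As a rough illustration, the pattern can be tried against the sample words from the question. This is only a sketch: it assumes a JavaScript engine with lookbehind support (ES2018 or later), and the names findSpecialWords and sample are made up for the example.

```javascript
// Pattern from the answer; the g flag collects every matching word in the input.
const wordPattern = /(?<=\s|^)(?:[\d_]\S+|\S*[%@#$]\S*)(?=\s|$)/g;

// Hypothetical helper: returns all whitespace-delimited words that match,
// or an empty array when nothing matches.
function findSpecialWords(text) {
  return text.match(wordPattern) || [];
}

// Sample words from the question, separated by spaces (assumed layout).
const sample =
  "3testData 3test_Data _testData _test3Data %data% test%BIN%data te$t&$#@daTa test_Data3";

// Logs the first seven words; test_Data3 is not included.
console.log(findSpecialWords(sample));
```

Note that the character class only lists %, @, # and $, so a word whose only special character is something else (for example &) would need that character added to the class.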
1
jean3xw On

If I understand the question correctly, you are looking for a starting % and an ending % in a string. Assuming each string contains at most one such pair, you could use indexOf and lastIndexOf, like this:

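function searchTagIn(symbol, str) {
  // Positions of the first and last occurrence of the symbol.
  const first = str.indexOf(symbol);
  const last = str.lastIndexOf(symbol);
  // Only proceed when the symbol occurs at least twice.
  if (first > -1 && last !== first) {
    // Text from the first symbol up to, but not including, the last one.
    return str.substring(first, last);
  }
  return undefined;
}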