Regex for masking data


I am trying to write a regex that masks sensitive data in a JSON response.

The JSON response contains AccountNumber and AccountName.

The masking requirements are as follows.

accountNumber Before: 7835673653678365
accountNumber Masked: 783567365367****

accountName Before : chris hemsworth
accountName Masked : chri* *********

I am able to match the first part with [0-9]{12} and (?![0-9]{12}), but when I replace the match, the whole match is replaced with a single *, so my regex does not produce the correct output.

How can I produce the output above with a regex?

2 Answers

0
Pushpesh Kumar Rajwanshi On Best Solutions

If all you want is to mask every character except the first N, I don't think you really need a complicated regex. To skip the first N characters and replace every character after them with *, you can write a generic regex like this,

(?<=.{N}).

where N can be any number such as 1, 2, 3, etc., and replace each match with *

This regex matches every character that has at least N characters before it, so once one character matches, every character after it matches as well.
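As a minimal sketch of that idea, the pattern can be built for any N at runtime. The class name PrefixMasker and the method maskAfter are hypothetical names chosen for illustration, not part of any library.

```java
class PrefixMasker {
    // Hypothetical helper: keeps the first `keep` characters and masks the rest.
    static String maskAfter(String value, int keep) {
        // (?<=.{keep}). matches any single character that has
        // at least `keep` characters before it.
        String regex = "(?<=.{" + keep + "}).";
        return value.replaceAll(regex, "*");
    }

    public static void main(String[] args) {
        System.out.println(maskAfter("7835673653678365", 12)); // 783567365367****
        System.out.println(maskAfter("chris hemsworth", 4));   // first 4 kept, rest masked
    }
}
```

Note that . does not match line terminators by default, so this sketch assumes single-line values.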

For example, in your AccountNumber case, N = 12, so your regex becomes,

(?<=.{12}).

Regex Demo for AccountNumber masking

Java code,

String s = "7835673653678365";
System.out.println(s.replaceAll("(?<=.{12}).", "*"));

Prints,

783567365367****

And for AccountName case, N = 4, hence your regex becomes,

(?<=.{4}).

Regex Demo for AccountName masking

Java code,

String s = "chris hemsworth";
System.out.println(s.replaceAll("(?<=.{4}).", "*"));

Prints,

chri***********
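
The output above also masks the space, while the question asks for chri* *********. One possible variation, offered as an assumption rather than part of the original answer, is to match only non-whitespace characters with \S instead of the dot. The names SpaceKeepingMasker and maskName are hypothetical.

```java
class SpaceKeepingMasker {
    // Hypothetical helper: keeps the first 4 characters, masks the rest,
    // and leaves whitespace untouched.
    static String maskName(String name) {
        // The lookbehind still counts every character (including spaces),
        // but \S only matches non-whitespace, so spaces are never replaced.
        return name.replaceAll("(?<=.{4})\\S", "*");
    }

    public static void main(String[] args) {
        System.out.println(maskName("chris hemsworth")); // chri* *********
    }
}
```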
2
The fourth bird On

If you match [0-9]{12} and replace the whole match with a single asterisk, you are left with accountNumber Before: *8365

There is no programming language listed, but one option for replacing the digits at the end is to combine two assertions: a positive lookbehind that asserts 12 digits on the left, and a positive lookahead that asserts 0+ digits on the right followed by the end of the string.

Then in the replacement use *

If the JSON value is exactly chris hemsworth or 7835673653678365, you can omit the positive lookaheads (?=\d*$) and (?=[\w ]*$), which assert the end of the string in the following 2 expressions.

Use the versions with the positive lookahead if the data to match is at the end of the string and the string contains more data, so that you don't replace more matches than you expect.

(?<=[0-9]{12})(?=\d*$)\d

In Java:

(?<=[0-9]{12})(?=\\d*$)\\d
  • (?<=[0-9]{12}) Positive lookbehind, assert what is on the left are 12 digits
  • (?=\d*$) Positive lookahead, assert what is on the right are 0+ digits and assert the end of the string
  • \d Match a single digit

Regex demo

Result:

783567365367****
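
To illustrate why the lookahead matters when the string contains more data, here is a small sketch that compares the pattern with and without (?=\d*$). The input with two numbers is a made-up example, and the class and method names are hypothetical.

```java
class TrailingDigitMasker {
    // Masks digits that follow 12 digits AND belong to the digit run at the end of the string.
    static String maskAtEnd(String s) {
        return s.replaceAll("(?<=[0-9]{12})(?=\\d*$)\\d", "*");
    }

    // Same pattern without the lookahead: masks after 12 digits anywhere in the string.
    static String maskAnywhere(String s) {
        return s.replaceAll("(?<=[0-9]{12})\\d", "*");
    }

    public static void main(String[] args) {
        // Hypothetical input: an unrelated 16-digit value followed by the account number.
        String s = "1234567890123456 7835673653678365";
        System.out.println(maskAtEnd(s));    // 1234567890123456 783567365367****
        System.out.println(maskAnywhere(s)); // 123456789012**** 783567365367****
    }
}
```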

For the account name you might do that with 4 word characters \w, but this will also replace the whitespace with an asterisk, because I believe you cannot skip matching that space in one regex.

(?<=[\w ]{4})(?=[\w ]*$)[\w ]

In Java

(?<=[\\w ]{4})(?=[\\w ]*$)[\\w ]

Regex demo

Result

chri***********