Regex to identify Aadhaar Number


I am having trouble writing a regex to detect Aadhaar numbers in DLP.

The built-in patterns are:

\b[2-9][0-9]{11}\b
\b[2-9][0-9]{3} [0-9]{4} [0-9]{4}\b

These patterns work, but they produce many false positives because digits are also read vertically, across line breaks. The text below is treated as an Aadhaar number when read vertically, which I do not want.

For example:

2355(New Line)
2345(New Line)
7868

I also want the search restricted to exactly 12 digits: if a number has 11 or 13 digits, it should not count as a match.

I tried the pattern below. Please suggest whether it is suitable for searching an entire document for Aadhaar numbers.

^[2-9][0-9]{3}\s[0-9]{4}\s[0-9]{4}$
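A quick way to check this attempt is to run it against the examples from the question. The sketch below uses Python's re module, which is an assumption: the DLP product's regex engine may treat anchors and whitespace differently. In Python, \s also matches a newline, so the vertical case is still reported, while the ^ and $ anchors prevent a match when the number sits inside a longer line.

```python
import re

# The pattern attempted above, compiled with Python's re module (an assumption;
# the DLP engine may differ). MULTILINE makes ^ and $ anchor at each line.
ATTEMPT = re.compile(r"^[2-9][0-9]{3}\s[0-9]{4}\s[0-9]{4}$", re.MULTILINE)

vertical = "2355\n2345\n7868"          # the false positive from the question
inline = "Aadhaar: 2355 2345 7868"     # number embedded in other text
alone = "2355 2345 7868"               # number on a line by itself

# \s matches "\n", so the vertically stacked digits are still reported.
print(bool(ATTEMPT.search(vertical)))
# ^ requires the number to start the line, so the embedded number is missed.
print(bool(ATTEMPT.search(inline)))
# A line containing only the number matches.
print(bool(ATTEMPT.search(alone)))
```

So, at least under Python's semantics, this pattern neither removes the vertical false positive nor finds numbers that appear mid-sentence.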

There are 2 answers

Ofer Calvo On

Your RegEx looks right to me.

But keep in mind that your pattern is anchored: in multi-line mode ^ and $ match the start and end of a line, so the number must begin at the start of a line and finish at the end of one.

You can experiment with it in this regex101 share link.

Also, you can check this geeksforgeeks.org post for more details.


After reading the comment below, I revised my answer to this:

\b[2-9][0-9]{3}[^\S\r\n][0-9]{4}[^\S\r\n][0-9]{4}\b

I took the character class [^\S\r\n] from Greg Bacon's answer on matching whitespace but not newlines and combined it with your pattern. Use the updated regex101 share link to test it further.
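As a minimal sketch of how the revised pattern behaves, the Python snippet below combines it with the unspaced 12-digit built-in pattern from the question. The names SPACED, PLAIN, AADHAAR and find_candidates are made up for illustration, and Python's re module is assumed; a DLP engine may differ in how it handles \b or character classes.

```python
import re

# [^\S\r\n] means "whitespace that is not CR or LF" (space, tab, ...),
# so the three groups must sit on the same line.
SPACED = r"\b[2-9][0-9]{3}[^\S\r\n][0-9]{4}[^\S\r\n][0-9]{4}\b"
# The unspaced 12-digit form, taken from the built-in pattern in the question.
PLAIN = r"\b[2-9][0-9]{11}\b"
AADHAAR = re.compile(f"{SPACED}|{PLAIN}")


def find_candidates(text: str) -> list[str]:
    """Return every substring that looks like an Aadhaar number."""
    # No capture groups are used, so findall returns the full matches.
    return AADHAAR.findall(text)


print(find_candidates("ID 2355 2345 7868 on file"))  # same-line groups
print(find_candidates("2355\n2345\n7868"))           # vertical digits
print(find_candidates("2355234578681"))              # 13 digits
print(find_candidates("23552345786"))                # 11 digits
```

The \b anchors are what reject the 11- and 13-digit runs: a word boundary cannot occur between two digits, so a 12-digit window inside a longer run of digits never qualifies.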

Good luck.

Gowtham Akshaya Kumaran On

Regex - \b(\d{4}\s\d{4}\s\d{4})\b|\b(\d{12})\b|\b(\d{4}-\d{4}-\d{4})\b

The regex pattern matches the formats below:

0000 0000 0000
0000-0000-0000
000000000000

This will work for numbers with 12 digits.
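To see what this pattern accepts, here is a short check using Python's re module (an assumption; the DLP engine may behave differently). The helper name matches is hypothetical. It shows the three formats being found, and also two properties worth knowing before using the pattern for the original problem: \s matches a newline, and the first digit is not restricted to 2-9.

```python
import re

# The pattern from this answer, unchanged.
PATTERN = re.compile(
    r"\b(\d{4}\s\d{4}\s\d{4})\b|\b(\d{12})\b|\b(\d{4}-\d{4}-\d{4})\b"
)


def matches(text: str) -> list[str]:
    # finditer + group(0) returns the whole match regardless of which
    # alternative (and therefore which capture group) matched.
    return [m.group(0) for m in PATTERN.finditer(text)]


# All three formats are found.
print(matches("2355 2345 7868, 2355-2345-7868, 235523457868"))
# \s also matches "\n", so vertically stacked digits are still matched.
print(matches("2355\n2345\n7868"))
# \d allows a leading 0 or 1, unlike the [2-9] used in the question.
print(matches("0000 0000 0000"))
```

If the vertical false positive matters, replacing \s with [^\S\r\n] and the leading \d with [2-9] would bring this pattern in line with the constraints stated in the question.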