Regex for extracting names starting with Mr. or Mrs.


I was trying to write a regex to identify names starting with

Mr.|Mrs.

for example

Mr. A, Mrs. B.

I tried several expressions and checked them with the online tool at pythonregex.com. The test string used is:

"hey where is Mr A how are u Mrs. B tt`"

The outputs shown are from Python's findall() function, i.e.

regex.findall(string)

The regexes and their respective outputs are below.

Mr.|Mrs. [a-zA-Z]+  o/p-[u'Mr ', u'Mrs']

Why are A and B not appearing with Mr. and Mrs.?

[Mr.|Mrs.]+ [a-zA-Z]+ o/p-[u's Mr', u'. B']

Why is s appearing with Mr instead of A?

I tried many more combinations, but these are the ones that confuse me. I know the name part of the regex has to cover more cases, but I was starting from the basics.


There are 2 answers

1
Avinash Raj (best answer)

Change your regex to the following:

(?:Mr\.|Mrs\.) [a-zA-Z]+


  1. You need to put Mr\. and Mrs\. inside a non-capturing or capturing group, so that the | (OR) applies only within the group. Without a group, | splits the entire pattern into two alternatives: Mr. on one side and Mrs. [a-zA-Z]+ on the other.
  2. You need to escape the dot to match a literal dot. An unescaped . is a metacharacter that matches any character except a line break.
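
The two points above can be checked with a short Python sketch that reuses the test string from the question. The variable names are only illustrative. Note two things it brings out: with a capturing group, findall() returns only the group's text, so the non-capturing form is the one that returns the full match; and because the test string has "Mr A" without a dot, the escaped pattern is expected to match only "Mrs. B".

```python
import re

text = "hey where is Mr A how are u Mrs. B tt`"

# Ungrouped: | splits the whole pattern into "Mr." OR "Mrs. [a-zA-Z]+".
# The unescaped dot matches any character (the space after "Mr",
# or the "s" in "Mrs"), so the first alternative always wins.
ungrouped = re.findall(r"Mr.|Mrs. [a-zA-Z]+", text)

# Non-capturing group with escaped dots: the alternation is confined
# to the title, and \. matches only a literal dot.
grouped = re.findall(r"(?:Mr\.|Mrs\.) [a-zA-Z]+", text)

# Capturing group: findall() returns the group's text, not the whole match.
captured = re.findall(r"(Mr\.|Mrs\.) [a-zA-Z]+", text)

print(ungrouped)  # ['Mr ', 'Mrs']
print(grouped)    # ['Mrs. B']
print(captured)   # ['Mrs.']
```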

OR

An even shorter one:

Mrs?\. [a-zA-Z]+

The ? quantifier makes the preceding character, s, optional.
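
As a quick check, here is a minimal sketch of the shorter pattern against the question's test string. The second pattern, with \.? making the dot optional as well, is not part of the answer; it is an assumed variation for input such as "Mr A" where the dot is missing.

```python
import re

text = "hey where is Mr A how are u Mrs. B tt`"

# Pattern from the answer: "s" is optional, the literal dot is required.
strict = re.compile(r"Mrs?\. [a-zA-Z]+")

# Assumed variation (not from the answer): the dot is optional too.
lenient = re.compile(r"Mrs?\.? [a-zA-Z]+")

strict_matches = strict.findall(text)
lenient_matches = lenient.findall(text)

print(strict_matches)   # ['Mrs. B']
print(lenient_matches)  # ['Mr A', 'Mrs. B']
```

The lenient form accepts more input, so it may also pick up unwanted text in other strings; adding a \b before Mr is one way to tighten it.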

1
WeaselFox

There's a Python library for parsing human names:

https://github.com/derek73/python-nameparser

Much better than writing your own regex.