Python RegEx for all National Drug Codes (NDC 10 & 11) formats

Question

642 views Asked by DanielBell99 At 10 March 2022 at 09:39

Goal: a single RegEx that matches all possible NDC 10 & 11 formats.

I've made a great start... NDC 10:

^[0-9][0-9][0-9][0-9]\-[0-9][0-9][0-9][0-9]\-[0-9][0-9]$

e.g. 1234-1234-12

However, I've since learnt that there are other 10-digit formats, as well as an 11-digit format:

4-4-2
5-3-2
5-4-1
5-4-2 (11 digits)

How can I write one RegEx for all these possibilities?

Issues:

Optional 11th digit
Hyphen positions that move between formats

There are 2 answers

Sean O'Malley On 21 September 2023 at 14:13

A way to convert NDC 10 to NDC 11 within a pandas DataFrame using list comprehensions:

# split each NDC10 once into its three hyphen-separated segments
parts = list(ndcd.NDC10.str.split("-"))

# add a leading zero to whichever segment is short, to fit the 5-4-2 zero-filled NDC11 format
a = [f"0{x[0]}" if len(x[0]) == 4 else x[0] for x in parts]
b = [f"0{x[1]}" if len(x[1]) == 3 else x[1] for x in parts]
c = [f"0{x[2]}" if len(x[2]) == 1 else x[2] for x in parts]

# rejoin the segments (without hyphens) into the full NDC11
ndcd["NDC11"] = ["".join(x) for x in zip(a, b, c)]
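The padding rule in the code above can also be written as a plain function, which is easier to check outside a DataFrame. This is only a minimal sketch, assuming the input is a hyphenated NDC with exactly three segments; `ndc10_to_ndc11` is a hypothetical helper name, and `str.zfill` stands in for the conditional f-strings.

```python
def ndc10_to_ndc11(ndc10: str) -> str:
    """Zero-pad a hyphenated NDC10 to the 11-digit 5-4-2 layout (no hyphens)."""
    parts = ndc10.split("-")
    if len(parts) != 3:
        raise ValueError(f"expected three hyphen-separated segments: {ndc10!r}")
    labeler, product, package = parts
    # zfill left-pads with zeros only when a segment is shorter than the target width
    return labeler.zfill(5) + product.zfill(4) + package.zfill(2)


if __name__ == "__main__":
    # one sample per 10-digit layout: 4-4-2, 5-3-2, 5-4-1
    for code in ["1234-5678-90", "12345-678-90", "12345-6789-0"]:
        print(code, "->", ndc10_to_ndc11(code))
```

With a DataFrame like the one above, it could presumably be applied as `ndcd["NDC11"] = ndcd["NDC10"].map(ndc10_to_ndc11)`.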

Wiktor Stribiżew (accepted answer) On 10 March 2022 at 09:50

You can use

^(?:\d{4}-\d{4}-\d{2}|\d{5}-(?:\d{3}-\d{2}|\d{4}-\d{1,2}))$

Details:

^ - start of string
(?: - start of the first non-capturing group:
- \d{4}-\d{4}-\d{2} - four digits, -, four digits, -, two digits
- | - or
- \d{5}- - five digits, -
- (?: - start of the second non-capturing group:
  - \d{3}-\d{2} - three digits, -, two digits
  - | - or
  - \d{4}-\d{1,2} - four digits, - and one or two digits
- ) - end of the second non-capturing group
) - end of the first non-capturing group.
$ - end of string.
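For use from Python, the same pattern can be written with `re.VERBOSE` so that each branch carries its own comment. This is a sketch rather than a definitive implementation: `NDC_PATTERN` and `is_valid_ndc` are illustrative names, `fullmatch` is used in place of the `^`/`$` anchors (in Python, `$` also matches just before a trailing newline), and `re.ASCII` is an added assumption that only the digits 0-9 should be accepted.

```python
import re

# Verbose form of the pattern above; NDC_PATTERN and is_valid_ndc are
# illustrative names, not part of the original answer.
NDC_PATTERN = re.compile(
    r"""
    (?:
        \d{4}-\d{4}-\d{2}        # 4-4-2
      | \d{5}-                   # five digits and a hyphen, then either:
        (?:
            \d{3}-\d{2}          #   5-3-2
          | \d{4}-\d{1,2}        #   5-4-1 or 5-4-2 (11 digits)
        )
    )
    """,
    re.VERBOSE | re.ASCII,  # re.ASCII restricts \d to 0-9
)


def is_valid_ndc(code: str) -> bool:
    """Return True if code is a hyphenated NDC in one of the four layouts."""
    # fullmatch requires the whole string to match, so no anchors are needed
    return NDC_PATTERN.fullmatch(code) is not None


if __name__ == "__main__":
    samples = ["1234-1234-12", "12345-123-12", "12345-1234-1",
               "12345-1234-12", "1234-123-12", "12345678901"]
    for sample in samples:
        print(sample, is_valid_ndc(sample))
```

The first four samples cover the four layouts from the question; the last two (a 4-3-2 layout and an unhyphenated string) are rejected.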
