Create check digit function


I'm trying to compute check digits and append them to the end of the original UPCs.

Because the UPCs have leading zeros, I have to read the data as strings first:

import pandas as pd                                                                 
upc = pd.read_csv("/Users/lee/Desktop/upc.csv", dtype = str)

Here's an example of the check digit algorithm, for the UPC 003459409000:
step (1) add up the digits, multiplying every digit in an even position by 3: 0 + 3*0 + 3 + 3*4 + 5 + 3*9 + 4 + 3*0 + 9 + 3*0 + 0 + 3*0 = 60
step (2) 60 mod 10 = 0
step (3) if the result of step 2 is 0, the check digit is 0; otherwise the check digit is 10 minus the result of step 2
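
The three steps can be traced directly in Python for this one example. This is only a sketch of the arithmetic, and the variable names (`weighted`, `total`, `remainder`, `check`) are illustrative:

```python
upc = "003459409000"
digits = [int(c) for c in upc]

# Step 1: weight digits in even (1-based) positions by 3, others by 1
weighted = [d * (3 if pos % 2 == 0 else 1) for pos, d in enumerate(digits, start=1)]
total = sum(weighted)

# Step 2: remainder after dividing by 10
remainder = total % 10

# Step 3: 0 stays 0, anything else becomes 10 - remainder
check = 0 if remainder == 0 else 10 - remainder

print(total, remainder, check)
```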

Based on the algorithm, here's the code:

def add_check_digit(upc_str):
    upc_str = str(upc_str)
    if len(upc_str) != 12:
        raise ValueError("Invalid length")

    odd_sum = 0
    even_sum = 0
    for i, char in enumerate(upc_str):
        j = i + 1  # 1-based position of the digit
        if j % 2 == 0:
            even_sum += int(char)
        else:
            odd_sum += int(char)
    # Digits in even positions are weighted by 3
    total_sum = (even_sum * 3) + odd_sum
    mod = total_sum % 10
    check_digit = 10 - mod
    if check_digit == 10:  # mod was 0, so the check digit is 0
        check_digit = 0
    return upc_str + str(check_digit)

If I run this code, it gives the correct check digit and appends it to the end of the original UPC. For the example above, if I type:

add_check_digit('003459409000')

the output is the 13-digit UPC 0034594090000.

Now my questions are:

  1. This function works only for a single UPC, i.e., I have to copy/paste each UPC and get its check digit. How do I create a function that works for a list of UPCs in a dataframe? Each result should be a 13-digit UPC with the check digit appended to the original UPC.

  2. The UPCs are read as strings. How do I apply the function to the UPCs? I suppose I should convert the strings to numbers somehow.

  3. After I get the new UPCs, how do I save the result in a csv file?

simpleApp On BEST ANSWER

I don't have your CSV file, so I build a small sample DataFrame below. In your case, read the file the same way you already do:

df = pd.read_csv("/Users/lee/Desktop/upc.csv", dtype = str)

Sample data setup and the function:

import pandas as pd

# Sample data standing in for the CSV file
df = pd.DataFrame({"upc_in_file": ['003459409000', '003459409001', '003459409002']})

def add_check_digit(upc_str):
    upc_str = str(upc_str)
    if len(upc_str) != 12:
        raise ValueError("Invalid length")

    odd_sum = 0
    even_sum = 0
    for i, char in enumerate(upc_str):
        j = i + 1  # 1-based position of the digit
        if j % 2 == 0:
            even_sum += int(char)
        else:
            odd_sum += int(char)
    # Compute the total only after the loop has seen all 12 digits
    total_sum = (even_sum * 3) + odd_sum
    mod = total_sum % 10
    check_digit = 10 - mod
    if check_digit == 10:  # mod was 0, so the check digit is 0
        check_digit = 0
    return upc_str + str(check_digit)

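Apply the function to the UPC column (the one read from the file). The values can stay as strings; no numeric conversion is needed, because the function already converts each character with int():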

df['new_upc']=df['upc_in_file'].apply(add_check_digit)
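
As a possible alternative, the loop can be condensed with slicing. The sketch below assumes the same weighting as in the question (even positions times 3); the helper name `check_digit` is hypothetical, and `(10 - total % 10) % 10` is just a compact way to turn a result of 10 into 0:

```python
import pandas as pd

def check_digit(upc: str) -> str:
    """Return the check digit (as a string) for a 12-digit UPC string."""
    if len(upc) != 12 or not upc.isdigit():
        raise ValueError(f"Expected 12 digits, got {upc!r}")
    digits = [int(c) for c in upc]
    # digits[0::2] are positions 1, 3, 5, ...; digits[1::2] are positions 2, 4, 6, ...
    total = sum(digits[0::2]) + 3 * sum(digits[1::2])
    return str((10 - total % 10) % 10)

df = pd.DataFrame({"upc_in_file": ["003459409000", "003459409001", "003459409002"]})

# String concatenation keeps the leading zeros intact
df["new_upc"] = df["upc_in_file"] + df["upc_in_file"].map(check_digit)
print(df)
```

Because both columns remain strings throughout, the leading zeros are never dropped.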

Now save the result to a CSV file:

df.to_csv("my_updated_upc.csv")
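
By default `to_csv` also writes the DataFrame index as an extra first column; passing `index=False` leaves it out. When the saved file is read back later, `dtype=str` is needed again, otherwise the leading zeros are lost. A minimal round-trip sketch, using an in-memory buffer as a stand-in for the file path (an assumption made only to keep the example self-contained):

```python
import io
import pandas as pd

df = pd.DataFrame({
    "upc_in_file": ["003459409000"],
    "new_upc": ["0034594090000"],
})

buffer = io.StringIO()           # stands in for a path such as "my_updated_upc.csv"
df.to_csv(buffer, index=False)   # index=False: do not write the row index

buffer.seek(0)
restored = pd.read_csv(buffer, dtype=str)  # read back as strings to keep leading zeros
print(restored)
```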
