I'm trying to create check digits and append them to the original UPCs. Here's the sample data.
Because there are leading 0's, I have to read the data as strings first:
import pandas as pd
upc = pd.read_csv("/Users/lee/Desktop/upc.csv", dtype = str)
Here's an example of the check digit algorithm:
If upc is 003459409000
step (1) 0 + 3*0 + 3 + 3*4 + 5 + 3*9 + 4 + 3*0 + 9 + 3*0 + 0 + 3*0 = 60
step (2) 60 mod 10 = 0
step (3) check digit = 0 (if it's not 0, then check digit = 10 - number in step 2)
Based on the algorithm, here's the code:
def add_check_digit(upc_str):
    upc_str = str(upc_str)
    if len(upc_str) != 12:
        raise Exception("Invalid length")
    odd_sum = 0
    even_sum = 0
    for i, char in enumerate(upc_str):
        j = i+1
        if j % 2 == 0:
            even_sum += int(char)
        else:
            odd_sum += int(char)
    total_sum = (even_sum * 3) + odd_sum
    mod = total_sum % 10
    check_digit = 10 - mod
    if check_digit == 10:
        check_digit = 0
    return upc_str + str(check_digit)
If I run this code, it gives the correct check digit and appends it to the end of the original UPC. For the example above, if I type:
add_check_digit('003459409000')
The output is the 13-digit UPC 0034594090000.
Now my questions are:
This function works only for a single UPC, i.e., I have to copy/paste each UPC one at a time to get its check digit. How do I create a function that works for a list of UPCs in a dataframe? Each result should be a 13-digit UPC with the check digit appended after the original UPC.
The UPCs are read as strings. How do I apply the function to the UPCs? I suppose I should convert the strings to numbers somehow.
After I get the new UPCs, how do I save the result in a csv file?
Data setup for me, since I don't have the CSV file. This step stands in for your read_csv call; everything after it is the same as with your data.
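The setup code did not survive in this post, so here is a minimal sketch of what such a stand-in could look like. The column name `upc` and the second value are assumptions made up for illustration; only the first value comes from the question.

```python
import pandas as pd

# Stand-in for pd.read_csv("/Users/lee/Desktop/upc.csv", dtype=str).
# The column name "upc" is an assumption; "123456789012" is made-up data.
upc = pd.DataFrame({"upc": ["003459409000", "123456789012"]})

# The values are Python strings, so the leading zeros are preserved.
print(upc["upc"].tolist())
```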
Apply the function above to the upc column (the one that was read from the file).
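A sketch of this step using `Series.apply`, which calls the function once per row. The function is repeated here in a compact form only so the snippet runs on its own; the column names `upc` and `upc_with_check` and the second sample value are assumptions. There is no need to convert the column to numbers first, because the function already converts each character with `int()` internally.

```python
import pandas as pd

def add_check_digit(upc_str):
    """Same rule as the question's function, written compactly."""
    upc_str = str(upc_str)
    if len(upc_str) != 12:
        raise ValueError("Invalid length")
    digits = [int(c) for c in upc_str]
    # Counting from 1: odd positions have weight 1, even positions weight 3.
    total = sum(digits[0::2]) + 3 * sum(digits[1::2])
    return upc_str + str((10 - total % 10) % 10)

# Hypothetical data; in practice this comes from read_csv(..., dtype=str).
upc = pd.DataFrame({"upc": ["003459409000", "123456789012"]})

# apply() runs the function on every value and returns a new Series.
upc["upc_with_check"] = upc["upc"].apply(add_check_digit)
print(upc["upc_with_check"].tolist())
# ['0034594090000', '1234567890128']
```

Writing the result to a new column keeps the original 12-digit values around for comparison; assigning back to `upc["upc"]` would overwrite them instead.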
Now save the file!
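Saving can be sketched with `DataFrame.to_csv`. The output file name and location below are hypothetical; `index=False` leaves out the row index so the file contains only the data columns.

```python
import os
import tempfile
import pandas as pd

# Hypothetical result frame holding the 13-digit UPC from the question.
result = pd.DataFrame({"upc": ["0034594090000"]})

# Hypothetical output path; replace it with wherever the file should go.
out_path = os.path.join(tempfile.gettempdir(), "upc_with_check_digits.csv")

# index=False omits the 0, 1, 2, ... row labels from the file.
result.to_csv(out_path, index=False)
```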
This will look like:
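The original output is missing from this post. As a rough illustration under the same assumptions (a single column named `upc` holding the one UPC from the question), the CSV text can be inspected without touching the disk, because `to_csv` returns a string when no path is given.

```python
import io
import pandas as pd

# Hypothetical result frame; only this one value comes from the question.
result = pd.DataFrame({"upc": ["0034594090000"]})

# With no path argument, to_csv returns the CSV content as a string.
csv_text = result.to_csv(index=False)
print(csv_text)
# upc
# 0034594090000

# Read it back as strings, otherwise the leading zeros are dropped.
round_trip = pd.read_csv(io.StringIO(csv_text), dtype=str)
```

When loading the saved file later, pass `dtype=str` again, just as in the original read; without it pandas parses the column as integers and the leading zeros disappear.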