I'm cleaning up my email contact database with a Python script. However, after I remove the duplicates, the shape of the DataFrame stays the same, even though my check shows that there are over 600 duplicates. My code is below.

I used the .drop_duplicates function to remove the duplicates and .shape to check the size again.
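As a sanity check on how I understand these two calls, here is a minimal sketch on made-up data (the names are invented and are not from my file). It compares the count from .duplicated() with the shapes returned by .drop_duplicates for keep="first" and keep=False.

```python
import pandas as pd

# Toy data; these names are made up for illustration only.
toy = pd.DataFrame({"Last Name": ["Tan", "Lee", "Tan", "Ng", "Lee", "Tan"]})

# By default, duplicated() flags every repeat after the first occurrence.
n_repeats = int(toy["Last Name"].duplicated().sum())

# keep="first" keeps one row per name.
kept_first = toy.drop_duplicates(subset=["Last Name"], keep="first")

# keep=False drops every row whose name occurs more than once.
kept_none = toy.drop_duplicates(subset=["Last Name"], keep=False)

print(n_repeats, toy.shape, kept_first.shape, kept_none.shape)
```

On this toy frame the shape changes in both cases, so I would expect the same on my real data if the duplicates are exact matches.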

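import pandas as pd

# Load the contact list
data = pd.read_csv('ToBeSort.csv')
print(data.shape)  # size before removing duplicates

# Count the rows whose 'Last Name' repeats an earlier row
print(data['Last Name'].duplicated().sum())

# keep=False drops every row whose 'Last Name' occurs more than once
dupes = data.drop_duplicates(subset=["Last Name"], keep=False)
print(dupes.shape)  # size after removing duplicates
dupes.to_csv('New.csv')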

The duplicates still appear after I export to the new CSV. The expected output is a new CSV with no duplicate emails in it.
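To make the expected result concrete, here is a minimal sketch on made-up data. It assumes the email column is called "Email", which is a hypothetical name and may not match my real file, and it assumes that addresses differing only in letter case or surrounding spaces should count as the same contact.

```python
import pandas as pd

# Hypothetical sample; "Email" is an assumed column name, not taken from the real file.
contacts = pd.DataFrame({
    "Last Name": ["Tan", "Tan", "Lee", "Lee"],
    "Email": ["a.tan@example.com", "A.Tan@example.com ",
              "lee@example.com", "lee2@example.com"],
})

# Normalise case and surrounding whitespace so near-identical addresses compare equal.
email_key = contacts["Email"].str.strip().str.lower()

# Keep the first row for each distinct address.
deduped = contacts.loc[~email_key.duplicated(keep="first")]
print(deduped.shape)

# index=False leaves the row index out of the CSV.
# With no path given, to_csv returns the text; pass a filename such as 'New.csv' to write a file.
csv_text = deduped.to_csv(index=False)
```

Here the two "Lee" rows both survive because their emails differ, which is the behaviour I want: uniqueness by email, not by last name.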
