I am trying to generate Morgan fingerprints. The file loads, but when the code tries to read the SMILES it reports a problem understanding them.
Here is my code:
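import numpy as np
import pandas as pd
from rdkit import Chem
from rdkit.Chem import AllChem

# Function to generate Morgan fingerprints from SMILES
def generate_morgan_fingerprint(smiles):
    mol = Chem.MolFromSmiles(smiles)
    if mol:
        # Radius 2, 2048-bit fingerprint
        fp = AllChem.GetMorganFingerprintAsBitVect(mol, 2, nBits=2048)
        return list(fp)
    # Fall back to an all-zero vector when the SMILES cannot be parsed
    return [0] * 2048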
#Read the CSV file
df = pd.read_csv('/Users/Desktop/trainining_data.csv')
#Provide the input data part 2
#Define 'activity' as 'activity' column in the dataframe
activity=df['activity']
#Display the first few rows of the activity column
print(activity.head())
#Define 'SMILES' as the 'SMILES' column in the dataframe
smiles = df['SMILES']
data = {'SMILES','activity'}
df=pd.DataFrame(data)
# Calculate Morgan fingerprints
df['morgan_fp']= df['SMILES'].apply(generate_morgan_fingerprint)
X = np.array(df['morgan_fp'].to_list())
y=df['activity'].values
I have already tried changing df['SMILES'] to smiles, and that did not work. I also tried removing the data and df lines; neither helped.
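
One likely culprit, shown as a minimal sketch rather than a confirmed diagnosis: `{'SMILES','activity'}` is a set literal containing two strings, so rebuilding `df` from it discards the values read from the CSV. The small DataFrame below is a hypothetical stand-in for the real file (its contents, including the `other` column, are assumptions), and `subset` is an illustrative name.

```python
import pandas as pd

# Hypothetical stand-in for the CSV; the real file's contents are not shown.
df = pd.DataFrame({
    "SMILES": ["CCO", "c1ccccc1", "CC(=O)O"],
    "activity": [1, 0, 1],
    "other": ["a", "b", "c"],
})

# Curly braces without key: value pairs create a set of two strings.
# It holds only the column names, not the column data.
data = {"SMILES", "activity"}
print(type(data).__name__)

# Selecting the two columns from the loaded frame keeps the actual values.
subset = df[["SMILES", "activity"]].copy()
print(subset.columns.tolist())
print(len(subset))
```

If this is the cause, applying `generate_morgan_fingerprint` to `subset['SMILES']` (or to the original `df['SMILES']`, with the `data` and `pd.DataFrame(data)` lines removed) should pass real SMILES strings to RDKit.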