I am trying to run code using BioPython that iterates through the atoms of multiple proteins and removes all hydrogen atoms from them. To identify which atoms are hydrogens, I wrote the following code:
atom = residues[0].get_atoms()
atoms = list(atom)
atoms
[<Atom N>,
<Atom CA>,
<Atom C>,
<Atom O>,
<Atom CB>,
<Atom OG1>,
<Atom CG2>,
<Atom H1>,
<Atom H2>,
<Atom H3>,
<Atom HA>,
<Atom HB>,
<Atom HG1>,
<Atom HG21>,
<Atom HG22>,
<Atom HG23>]
How can I make it so that BioPython will return the name of the actual element of these atoms, such as "Carbon" or "Hydrogen"?
Note: I tried the following methods from the Bio.PDB.Atom module:
get_name()
get_id()
get_full_id()
None of them returned the desired element name.
You may need to translate the elements yourself. You can use the periodictable package. Note that biopython uses upper-case, e.g. FE, for element names and periodictable uses title-case, e.g. Fe. I created two methods below: one that takes a symbol, i.e. symbol_to_english, and another that takes the atom, i.e. atom_to_english, and passes its name to symbol_to_english.
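As a rough, dependency-free sketch of how two such helpers might look (not necessarily the code this answer originally used): it assumes each atom exposes its element symbol through an .element attribute, as Bio.PDB Atom objects do. The ELEMENT_NAMES dictionary is a hypothetical hand-written stand-in for the periodictable lookup, and heavy_atoms is an extra illustrative helper for the hydrogen-removal step from the question.

```python
from types import SimpleNamespace

# Hypothetical hand-written lookup so this sketch needs no third-party
# package; the periodictable package could supply names for all elements.
ELEMENT_NAMES = {
    "H": "Hydrogen",
    "C": "Carbon",
    "N": "Nitrogen",
    "O": "Oxygen",
    "S": "Sulfur",
    "FE": "Iron",
}


def symbol_to_english(symbol):
    """Map an element symbol in any case ('FE' or 'Fe') to an English name."""
    return ELEMENT_NAMES[symbol.strip().upper()]


def atom_to_english(atom):
    """Return the English element name for an atom with an .element attribute."""
    return symbol_to_english(atom.element)


def heavy_atoms(atoms):
    """Keep only the atoms whose element is not hydrogen."""
    return [atom for atom in atoms if atom.element.strip().upper() != "H"]


# Stand-in atoms for demonstration only; with BioPython these would come
# from residue.get_atoms() instead.
demo_atoms = [
    SimpleNamespace(name="CA", element="C"),
    SimpleNamespace(name="OG1", element="O"),
    SimpleNamespace(name="HG21", element="H"),
]

for atom in demo_atoms:
    print(atom.name, atom_to_english(atom))

print([atom.name for atom in heavy_atoms(demo_atoms)])
```

Normalising the symbol's case in one place keeps the upper-case versus title-case difference between the two libraries out of the calling code. Filtering on the element rather than on the atom name avoids relying on names that merely start with "H".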