With newer Anaconda installations of Python 3.11, my code fails with the traceback below when I try to update an existing DataFrame using df.loc with a new key (i.e., when I try to add new columns to an existing row in my df):
Traceback (most recent call last):
File "my.py", line 293, in <module>
...
File "utils.py", line 961, in update_dataframe_with_new_data
my_df.loc[my_df['Name'] == x1, 'newName'] = x1
~~~~~~~~~~^^^^^^^^^^^^
File "/apps/anaconda/2024.02/lib/python3.11/site-packages/pandas/core/frame.py", line 3893, in __getitem__
indexer = self.columns.get_loc(key)
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/apps/anaconda/2024.02/lib/python3.11/site-packages/pandas/core/indexes/range.py", line 418, in get_loc
raise KeyError(key)
KeyError: 'newName'
I am using python 3.11 from anaconda.
I do not manage this installation.
I found that the 2024.02 version of Anaconda (Python 3.11) exhibits this issue.
If I go back to the 2023.07 version of Anaconda, Python 3.11 does not exhibit this issue.
So I suspect that I am relying on unintentional behavior that has been deprecated or removed (but is not being caught and reported as such).
I would like to find a better way to update a DataFrame with new columns (keys). I do not necessarily know a priori what these new columns will be, so I need to handle this dynamically rather than defining the additional columns when I first initialize the df.
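To make the goal concrete, here is a minimal sketch of the pattern, using a made-up frame and a hypothetical helper name (`set_value_for_name`); it is not the real code from utils.py. It assumes the frame has a column labeled 'Name', and it creates the target column only if it is missing before assigning through df.loc.

```python
import numpy as np
import pandas as pd


def set_value_for_name(df, name, column, value):
    """Set `column` to `value` on rows where df['Name'] == name.

    Hypothetical helper for illustration: the column is created
    (filled with NaN, object dtype) if it does not exist yet.
    """
    if column not in df.columns:
        # Object dtype so that strings or numbers can be stored later.
        df[column] = pd.Series(np.nan, index=df.index, dtype="object")
    df.loc[df["Name"] == name, column] = value
    return df


# Made-up data standing in for the real frame.
my_df = pd.DataFrame({"Name": ["a", "b", "c"], "Value": [1, 2, 3]})
my_df = set_value_for_name(my_df, "b", "newName", "b")

# Inspecting the column labels can help when a label lookup fails.
print(type(my_df.columns).__name__, list(my_df.columns))
```

The traceback above passes through indexes/range.py, so it may also be worth checking what my_df.columns contains at the point of failure in the real code.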
Updates so far:
- Initializing the column beforehand does not help: my_df['newName'] = np.nan
- The newer Anaconda installation in my environment uses pandas 2.1.4, whereas the older 2023.07 installation uses pandas 1.5.3.