I am seeing extremely weird behavior on my machine when debugging my application.
macOS: 14.4 (23E214), 16 GB RAM
PyCharm: 2023.3.5 (Community Edition)
Python: 3.10
I reinstalled PyCharm twice, which had no effect on the outcome.
I am debugging locally with databricks.connect, fetching data in batches of 1000 rows, one batch after another, and then transforming the data into local objects (330 of them, each with about 20 fields), so nothing memory-intensive. Activity Monitor shows no abnormalities either.
The fun part is these lines:
    def set_value_fast(self, twin_id: str, key: str, value: str):
        if self.row_item_dict is None:
            dummy_row = create_row_from_schema(self._schema)
            self.row_item_dict = dummy_row.asDict()
            self.row_item_dict[self._primary_key] = twin_id
            self.row_item_dict[key] = value  # self._assign_value_based_on_data_type(key, value)
        else:
            self.row_item_dict[key] = value  # self._assign_value_based_on_data_type(key, value)
Written this way, it does not crash and creates my dictionary as expected. The problem appears when I use the commented-out function instead. (Please do not comment on the date conversion itself; I have tried a great many variations because I thought it was somehow the cause of the issue.)
    self.row_item_dict[key] = self._assign_value_based_on_data_type(key, value)

    def _assign_value_based_on_data_type(self, key, value):
        for time_stamp_column in self._time_stamp_columns:
            if key == time_stamp_column:
                try:
                    print(f"twinId: {self.twin_id} key: {key} value: {value}")
                    if value is None or value == '':
                        return None
                    if len(value.split('.')) > 1 and len(value.split('.')[1]) > 3:
                        # If microseconds are present, truncate to milliseconds
                        date_string = '.'.join(value.split('.')[:2])[:23]  # Truncate to milliseconds
                        # Convert date string to datetime
                        datetime_obj = datetime.strptime(date_string, '%Y-%m-%dT%H:%M:%S.%f')
                        # Set timezone to UTC
                        datetime_obj = datetime_obj.replace(tzinfo=timezone.utc)
                        return datetime_obj
                except Exception as e:
                    print("An error occurred:", e)
        return value
When I start the application in debug mode, this immediately appears in the output:
    twinId: XXXX#ZZZZ key: createdAt value: 2021-04-03T02:06:57.606Z
    /usr/local/Cellar/python@3.10/3.10.13_2/Frameworks/Python.framework/Versions/3.10/lib/python3.10/multiprocessing/resource_tracker.py:224: UserWarning: resource_tracker: There appear to be 2 leaked semaphore objects to clean up at shutdown
      warnings.warn('resource_tracker: There appear to be %d '
When I just run the code without the debugger, all objects are converted without any issues.
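For reference, the timestamp handling can be pulled out of the class into a standalone sketch, so it can be run on its own without databricks.connect. The function name `parse_timestamp` is hypothetical and not part of my code; the sketch assumes the same logic as the timestamp branch of `_assign_value_based_on_data_type`, minus the try/except and the column lookup.

```python
from datetime import datetime, timezone


def parse_timestamp(value):
    """Hypothetical standalone copy of the timestamp branch shown above."""
    if value is None or value == '':
        return None
    parts = value.split('.')
    if len(parts) > 1 and len(parts[1]) > 3:
        # 'YYYY-MM-DDTHH:MM:SS.mmm' is 23 characters, so slicing to 23
        # keeps millisecond precision and drops anything after it (e.g. 'Z')
        date_string = '.'.join(parts[:2])[:23]
        parsed = datetime.strptime(date_string, '%Y-%m-%dT%H:%M:%S.%f')
        # Mark the parsed value as UTC
        return parsed.replace(tzinfo=timezone.utc)
    # Anything else is returned unchanged, as in the method above
    return value


result = parse_timestamp('2021-04-03T02:06:57.606Z')
print(result)  # 2021-04-03 02:06:57.606000+00:00
```

The input is the same `createdAt` value that appears in the debug output above.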
Can anyone give me a hint about what could cause this behavior? I really have no idea what else to try.
I hope I have provided all the necessary information; if you need more, please let me know.
Thanks, Andre