I'm trying to run the following piece of python3 code:
import os
import psycopg2
import logging
# Set max attempts before giving up
MAX_ATTEMPTS = 5
# Set basic logging config to debug (i.e. log everything).
# By default, this will log te stdout (i.e. it will behave the same as print)
logging.basicConfig(level=logging.DEBUG)
# Grab DB url from env variable
database_url = os.environ.get('DATABASE_URL')
assert database_url is not None, 'DATABASE_URL env variable must be set to a postgres connection string.'
# Initiate psycopg2 and instantiate a cursor object
conn = psycopg2.connect(database_url)
cursor = conn.cursor()
# Define function to delete old records
def delete_old_records(cur):
# execute a query to delete old records. We're going to refer to this as the "delete" command
query = 'DELETE FROM my_table WHERE id NOT IN ( SELECT id FROM ( SELECT id FROM my_table ORDER BY id DESC LIMIT 1850 ) foo);'
cur.execute(query)
# Set variables to keep track of loop
successful = False
attempts = 0
# While not successful and max attempts not reached
while not successful and attempts < MAX_ATTEMPTS:
try:
# Attempt to delete old records
delete_old_records(cursor)
# Set successful to True if no errors were encountered in the previous line
successful = True
# Log a message
logging.info('Successfully truncated old records!')
# If some psycopg2 error happens
except psycopg2.Error as e:
# Log the error
logging.exception('Got exception when executing query')
# Rollback the cursor and get ready to try again
conn.rollback()
# Increment attempts by 1
attempts += 1
# If not able to perform operation after max attempts, log message to indicate failure
if not successful:
logging.warning(f'Was not successfully able to truncate logs after {MAX_ATTEMPTS} retries. '
f'Check logs for traceback (console output by default).')
Here's the problem:
The code executes successfully and without error. However, when we run the following command (hereafter referred to as the "count" command) on postico (Postgres GUI for Mac):
SELECT count(*) from my_table;
We get 1860 instead of 1850 (i.e. the rows were not deleted).
When running the delete command manually in psql or in postico, we get the correct results when running the COUNT command in psql or postico respectively. However, we get different results when running the command in ipython.
When I have an open connection to the db on ipython on computer A, and I run the delete command, and I open another connector to the db on ipython on computer B and run the count command, I see that the db row count has not changed i.e. still 1860, not cut to 1850.
I suspect caching/memoization but the I'm not really sure that my command really worked. Is there something in psycopg2, postico, or postgres itself that might be causing this? and how do we get around it? We don't see any clear cache on postico, or on psycopg2/postgres.
There is no caching involved. PostgreSQL does not cache query results.
You simply forgot to
COMMIT
the deleting transaction, so its effects are not visible in any concurrent transaction.