PyGreSQL upsert() Help Needed

110 views Asked by At

I am using CentOS-6 with Python 3.8, PostgreSQL 12 and PyGreSQL 5.2.2.

With older versions, I had a function like the following. Back then an update raised an exception if the row didn't already exist, so upon exception, I would insert the row of data.

import pg
db = pg.DB(myDatabaseName, myport)
try:
   db.update(tableName, None, **rowData)
except:
   db.insert(tableName, None, **rowData)

With the versions I'm using now, an update does NOT raise an exception if the row doesn't exist, so my code no longer functions correctly. I'm trying to replace that functionality by using upsert, but am running into problems.

My table has 50 columns. 1 column called phy is the primary key.

I thought the following upsert command would update the values of the columns I passed in the rowData variable and leave the other columns alone. In the case that there wasn't a row of data for this primary key, it would insert a new row of data.

rowData = {'phy': 1234, 'col2': 'test', 'col3': 'test'}
db.upsert(tableName, rowData)

The upsert method is updating EVERY column in the table. It is doing a command like this:

INSERT INTO tableName AS included ('phy', 'col2', col3') VALUES (1234, 'test', 'test) ON CONFLICT (phy) DO update set "col4" = excluded."col4", "col5"=excluded."col5" etc with all 50 columns listed here

Is there a way to configure the upsert method to only update the values of the columns passed in? If not, what is the best way to replace my original update or insert functionality?

Any help is appreciated!

1

There are 1 answers

1
Cito On BEST ANSWER

Regarding your first code snippet, it should work if you do it the other way around:

try:
    db.insert(tableName, None, **rowData)
except pg.IntegrityError:
    db.update(tableName, None, **rowData)

You can also simplify this:

try:
    db.insert(tableName, rowData)
except pg.IntegrityError:
    db.update(tableName, rowData)

Maybe this behaved differently with older PostgreSQL versions where the OID was used to update.

Regarding the upsert feature, I think you're right - it doesn't work as expected in the case where the dictionary doesn't contain all columns. I've already filed an issue with suggestion for improvement.

As a workaround, until this is fixed, you can explicitly declare via the keywords of the upsert() method which columns shall be updated. So you could do this:

rowData = {'phy': 1235, 'col2': 'test2', 'col3': 'test3'}
kw = {n: n in rowData for n in db.get_attnames(tableName)}
db.upsert(tableName, rowData, **kw)

I.e. you pass a dictionary that has False values for the columns that are not part of the specified rowData.