Slony "duplicate key value violates unique constraint" error

754 views Asked by At

I have a problem which goes on for longer time. I use slony to replicate database from master to slave and from that slave to three other backup servers. Once a 2-3 weeks there is a key duplication problem that happens only on one specific table (big but not biggest in database).

It started to occur like year ago on Postgres 8.4 and slony 1 and we switched to 2.0.1. Later we upgraded it to 2.0.4, and we succesfuly upgraded slony to 2.1.3 and it's our current version. We started fresh replication on same computers and it was all going well until today. We got the same duplication key error on same table (with different keys every time of course).

Solution to clean it up is just to delete invalid key on slaves (it spreads across all nodes) and it's all working again. Data is not corrupted. But source of problem remains unsolved.

In googles I found nothing related to this problem (we did not used truncate on any table, we did not change the structure of table).

Any ideas what can be done about it?

2

There are 2 answers

0
knitti On

When this problem occured in our setup, it turned out that the schema of the master database was older than the slaves' and didn't have the UNIQUE constraint for this particular column. So, my advice would be:

  • make sure the master table has in fact the constraint

if not:

  • clean the table
  • add the constraint

else:

  • revoke write privileges from all clients except slony for the replicated tables.
0
Chris Travers On

As Craig has said usually this is a write transaction to a replica. So the first things to do is to verify permissions. If this keeps happening, what you can do is start logging connections of the readers of the replicas and keep them around so when the issue happens, you can track down where the bad tuple came from. This can generate a LOT of logs however so you probably want to see to what extent you can narrow this down first. You presumably know which replica this is starting on, so you can go from there.

A particular area of concern I would spot would be what happens if you have a user defined function which writes. A casual observer might not spot that in the query, nor might a connection pooler.