Recovering from broken foreign key leftovers on postgresql

419 views Asked by At

We have a very simple one-to-many relation. site and resolver tables. resolver has a foreign key for site_id.

We decided to remove the resolver table as it's not useful anymore. So we just simply did DROP TABLE resolver. However, we use 2ndQuadrant BDRv1 for async multi-multi replication and it has a bug. It doesn't cleanup the constraints properly when you do that. We actually noticed this before for another use case and dropped the foreign key and primary keys before dropping the table and it worked out great. This time we missed it and we have this corrupt state in the field.

It doesn't cause much trouble except when you try to DELETE the site (which is thankfully not a common scenario).

could not open relation with OID 16904

 

There are workarounds but we need to clean this up on the next release. It's a product deployed by many customers, not our own deployment. So it needs to be as smooth as possible. So, while "recreate the database or table" etc. are valid answers (and probably should appear as answers, preferably with commands or links), they are not always the most feasible to accomplish on a customer deployment.

So, dropping constraint doesn't work because the table doesn't exist anymore.

ALTER TABLE site DROP CONSTRAINT resolver_site_id_fk;
ERROR:  constraint "resolver_site_id_fk" of relation "site" does not exist
ALTER TABLE resolver DROP CONSTRAINT resolver_site_id_fk;
ERROR:  relation "resolver" does not exist

So, I decided to accomplish what it does by messing up with the internal tables by looking around. I know, great idea..

delete from pg_depend WHERE objid IN (SELECT confrelid FROM pg_constraint WHERE conname = 'resolver_site_id_fk');
delete from pg_trigger WHERE tgconstrrelid IN (SELECT conrelid FROM pg_constraint WHERE conname = 'resolver_site_id_fk');
delete from pg_constraint WHERE conname = 'resolver_site_id_fk' OR conname = 'resolver_pkey';

after this, deleting a site still fails with

 cache lookup failed for constraint 16912

which is resolved by restarting postgresql which is not ideal.

 

I guess my questions are:

  • how can I avoid restart? There should be a way to tell the cache to remove it also like DROP CONSTRAINT does. ANSWER: it seems restarting the application (or probably the JDBC connection in my case) is enough.
  • what else should I clean up?
  • what alternative methods can I use instead of messing with the internal tables?
0

There are 0 answers