Currently I am setting up Master/Master Replication with bucardo between 5 Nodes on different locations (should provide location transparency). The database holds ~500 Tables which should be replicated. I grouped them into smaller replication herds of 50 Tables at maximum based on their dependency on each other. All tables have primary keys defined and the sequencers on each node are set up to provide system wide unique identities (based on residue class)
To get an initial database on each node, I made a --data-only
custom format pg_dump into a File and restored this on each node via pg_restore
. Bucardo sync is setup with the bucardo_latest
strategy to resolve conflicts. Now when I start syncing bucardo is deleting all datasets in the origin database first and inserting it again from one of the restored nodes, because all restored datasets have a "later timestamp" (the point in time when I called pg_restore). This ultimately prohibits the inital startup as bucardo needs very much time and also fails, as there are lots of datasets to solve and timeouts often too short.
I also have 'last_modified' timestamps on each table which are managed by UPDATE triggers, but as I understand it, pg_dump inserts data via COPY, and therefore these triggers don't get fired.
- Which timestamp does bucardo use to find out who is
bucardo_latest
? - Do I have to call
pg_dump
with something likeset SESSION_REPLICATION_ROLE = 'replica';
?
I just want bucardo to keep track of every new change, not executing pseudo changes because of the restore.
EDIT: pg_restore has definitely fired several triggers at restore time...as said I keep track on user and last modification date in each table, and those values are set to the user and timestamp when the restore was done. I am aware, that I can set SESSION_REPLICATION_ROLE for a plain text format restore via psql
. Is this also possible for pg_restore somehow?
The common approach is make the dump/restore process before configure the replication.
So an option will be:
bucardo
schema in each databasebucardo remove
for each object (most of them allow useall
, likebucardo remove table all
onetimecopy=0
. It's the default but I feel safer making it explicit.Which timestamp does bucardo use to find out who is bucardo_latest?
bucardo
handles its own timestamp value. Each table should have a trigger named likebucardo.delta_myschema_mytable
that makes and insert in a table named likebucardo.delta_myschema_mytable
. This table has a columntxntime timestamp with time zone not null default now()
and this is the timestamp used.Do I have to call pg_dump with something like set SESSION_REPLICATION_ROLE = 'replica';?
AFAIK, if
bucardo
triggers are already set in the tables, the option--disable-triggers
ofpg_restore
should do the trick.You can also check these articles about working with large databases and the use of session_replication_role