I'm trying to provision Oracle-to-MySQL replication using the parallel extraction method outlined in the Tungsten Replicator documentation.
- Setup CDC tables in Oracle using the setupCDC.sh script provided by Tungsten.
- Start the parallel extractor, specifying the starting SCN of the CDC process given by the previous script.
- The parallel extractor will insert all existing data using flaskback queries of the form
AS OF SCN ...
, performing a point-in-time provisioning with data integrity.
The problem is the setupCDC script prints out an SCN only if the CDC is asynchronous. It's hinted in an official forum thread that this is to "get a single position for the whole schema snapshot."
Due to licensing restrictions, I can only use synchronous CDC. Is it safe to manually read the SCN recorded in the all_capture
table and use it for provisioning? What are my options that can achieve both data integrity and minimum downtime?
- a. Disable write operations to the master database while provisioning is in progress:
- This is non-desirable as my database holds hundreds of gigabytes of data, probably resulting in a long downtime.
- b. Allow write operations during provisioning: any discrepancies will be fixed by re-applying all CDC data through normal replication after parallel extraction has processed all tables. Any errors raised during the re-application will have to be ignored.
- Would this be safe, from the viewpoint of data integrity?
For synchronous CDC, there is no entry in the
all_capture
table, which is for asynchronous capture processes.Instead, each change table records the SCN at the time of its creation. You can determine the lowest SCN from the
change_tables
table and provide it as the argument to the provisioning command:trepctrl online -provision <scn>
.(Replace
{service_name}
with your own service name.)