Currently, I am using Debezium to transfer data changes from our MySQL binlog into Kafka, and then into BigQuery.
I use the distributed-mode configuration with `snapshot.mode=initial`. Unfortunately, the amount of data in our MySQL instances is too large for this, because Debezium continuously reads the binlog from the beginning.
I tried `snapshot.mode=schema_only` together with `auto.offset.reset=latest`, but there is still a load spike on the MySQL instance.
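For reference, a minimal sketch of the connector configuration I am describing (hostnames, credentials, and names are placeholders, not my real values):

```json
{
  "name": "mysql-connector",
  "config": {
    "connector.class": "io.debezium.connector.mysql.MySqlConnector",
    "database.hostname": "mysql.example.com",
    "database.port": "3306",
    "database.user": "debezium",
    "database.password": "********",
    "database.server.id": "184054",
    "topic.prefix": "dbserver1",
    "snapshot.mode": "schema_only",
    "database.include.list": "inventory",
    "schema.history.internal.kafka.bootstrap.servers": "kafka:9092",
    "schema.history.internal.kafka.topic": "schema-changes.inventory"
  }
}
```

With `schema_only`, Debezium snapshots only the table schemas and then starts streaming from the current binlog position, so I expected it to avoid rereading historical data.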
Is there a proper configuration to minimize the heavy load from reading the binlog?