Confluent Connect, Debezium, java.lang.OutOfMemoryError


We are hosting a cluster of Confluent Connect v5.01 containers in Amazon EKS to run several instances of the Debezium connector for SQL Server, v0.9.5. Sometimes, when we reconfigure one of these connectors, it triggers a rebalance, and one connector (not always the same one) seems to go rogue and consume all of the memory allocated to its container. While this is happening, our logs fill with tens of thousands of slight variations of the following entry ...

INFO Skipping change ChangeTablePointer [changeTable=Capture instance "REDACTED" [sourceTableId=REDACTED, changeTableId=REDACTED, startLsn=008e7c10:00313f98:0010, changeTableObjectId=914583783, stopLsn=NULL], resultSet=SQLServerResultSet:2825973, completed=false, currentChangePosition=008e9893:00010320:0035(008e9893:00010230:0046)] as its position is smaller than the last recorded position 008e9893:00010320:0035(008e9893:00010320:0033) (io.debezium.connector.sqlserver.SqlServerStreamingChangeEventSource)

... before finally hitting the container's memory limit, which brings down the container.

Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "kafka-producer-network-thread | REDACTED-dbhistory"

This triggers another rebalance, which restarts the cycle.

We've tracked the issue down to this line of code in Debezium, which seems to indicate that the connector is trying to find where it left off when it was restarted but consumes too much memory in the process. Any leads you could offer to help us track down and address the issue would be greatly appreciated.
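
For reference, here is our reading of the comparison behind that log message, written as a minimal standalone sketch rather than Debezium's actual code. The class and method names (`LsnSketch`, `comparePosition`, and so on) are hypothetical. We are also assuming that a position is printed as `commitLsn(changeLsn)`, that each LSN is three colon-separated hexadecimal fields, and that positions are ordered by the first LSN and then by the second.

```java
public class LsnSketch {

    // Parse an LSN such as "008e9893:00010320:0035" into its three hex fields.
    // Assumption: the fields are compared numerically, left to right.
    public static long[] parseLsn(String lsn) {
        String[] parts = lsn.split(":");
        if (parts.length != 3) {
            throw new IllegalArgumentException("Unexpected LSN format: " + lsn);
        }
        long[] fields = new long[3];
        for (int i = 0; i < 3; i++) {
            fields[i] = Long.parseLong(parts[i], 16);
        }
        return fields;
    }

    // Negative if a < b, zero if equal, positive if a > b.
    public static int compareLsn(String a, String b) {
        long[] x = parseLsn(a);
        long[] y = parseLsn(b);
        for (int i = 0; i < 3; i++) {
            int c = Long.compare(x[i], y[i]);
            if (c != 0) {
                return c;
            }
        }
        return 0;
    }

    // Split "first(second)" into its two LSNs, as printed in the log entry.
    public static String[] splitPosition(String position) {
        int open = position.indexOf('(');
        if (open < 0 || !position.endsWith(")")) {
            throw new IllegalArgumentException("Unexpected position format: " + position);
        }
        return new String[] {
            position.substring(0, open),
            position.substring(open + 1, position.length() - 1)
        };
    }

    // Assumption: order by the LSN outside the parentheses, then by the one inside.
    public static int comparePosition(String a, String b) {
        String[] pa = splitPosition(a);
        String[] pb = splitPosition(b);
        int c = compareLsn(pa[0], pb[0]);
        return c != 0 ? c : compareLsn(pa[1], pb[1]);
    }

    public static void main(String[] args) {
        // Values copied from the log entry above.
        String current = "008e9893:00010320:0035(008e9893:00010230:0046)";
        String lastRecorded = "008e9893:00010320:0035(008e9893:00010320:0033)";
        boolean skip = comparePosition(current, lastRecorded) < 0;
        System.out.println(skip ? "skip" : "process");
    }
}
```

Under these assumptions, the two positions in the log entry share the same first LSN and differ only in the second one, so the change is classed as older than the last recorded position and skipped. If that reading is right, the connector would be walking through every change that precedes its stored offset one entry at a time after a restart.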
