Postgres/JDBC/Logical replication - out of memory issues

3.6k views Asked by At

I am developing an application which connects to a logical replication slot, to consume the WAL events. These WAL events are then forwarded to a MQ broker. This works great, but I noticed that I am running out of memory after some time. I managed to minimize the problem to the code responsible for getting the WAL events. It occurs with the following code:

final Properties properties = new Properties();

PGProperty.USER.set(properties, "user");
PGProperty.PASSWORD.set(properties, "password");
PGProperty.ASSUME_MIN_SERVER_VERSION.set(properties, "9.4");
PGProperty.REPLICATION.set(properties, "database");
PGProperty.PREFER_QUERY_MODE.set(properties, "simple");

while (true) {
    Connection          connection   = null;
    PGConnection        PGConnection = null;
    PGReplicationStream stream       = null;

    try {
        connection = DriverManager.getConnection("jdbc:postgresql://localhost:5432/db", properties);
        PGConnection = connection.unwrap(PGConnection.class);
        stream = PGConnection.getReplicationAPI().replicationStream().logical().withSlotName("slot").start();

        while (true) {
            final ByteBuffer buffer = stream.read();

            // ... logic here ... (disabled during memory test)

            stream.setAppliedLSN(stream.getLastReceiveLSN());
            stream.setFlushedLSN(stream.getLastReceiveLSN());
        }
    } catch (final SQLException e1) {
        Logger.getLogger(getClass()).error(e1);

        if (stream != null) {
            try {
                stream.close();
            } catch (final SQLException e2) {
                Logger.getLogger(getClass()).error(e2);
            }
        }
        if (connection != null) {
            try {
                connection.close();
            } catch (final SQLException e2) {
                Logger.getLogger(getClass()).error(e2);
            }
        }
    }
}

I commented out the logic to parse the message and forward it to the MQ broker, as the out of memory also occurs without this.

I also tried to alter this example by using the polling method readPending() instead of the blocking method read() (as shown at https://jdbc.postgresql.org/documentation/head/replication.html), but the problem remains.

I also noticed that after a while, the application comes in a 100% CPU usage. This must be caused by the underlying libraries, as the read() is still processing normally at that moment (that is, it processes each WAL event sequentially).

During these tests, I am executing INSERT and UPDATE queries, at a low rate.

I am using following dependency:

<dependency>
    <groupId>org.postgresql</groupId>
    <artifactId>postgresql</artifactId>
    <version>42.1.4</version>
</dependency>

The application runs as a WAR in a Tomcat8 container.

Any idea what is going on?

update 1

I figured out what is going on, but cannot explain it so far. I'll go in details.

Every 10 seconds, I do INSERT and UPDATE queries, as said. These queries result in 645 WAL events. So every 10 seconds, I have to read() 645 events. In the beginning, this takes 0 (or sometimes 1) milliseconds to read() one event. After some time, it takes 1 milliseconds. Then, again some time later, it takes 2 milliseconds. And so on...

So after a while, I am unable to read() 645 events in 10 seconds, as the time required to read() keeps increasing. This explains the 100% CPU usage and the out of memory.

I am still unsure how to explain and how to fix this. I will keep investigating.

update 2

I tried adding buffer.clear() at the end of the loop, without success. I am still running into 100% CPU and memory issues. This is as expected, because the buffer is a local variable, so it's GC'ed after each loop any way. But I thought it would be a good idea to test anyway.

1

There are 1 answers

0
YasonTR On BEST ANSWER

I found the reason why I was running out of memory. I was testing with the decoderbufs decoding output plugin, https://github.com/xstevens/decoderbufs. When replacing by the built-in test plugin or wal2json (https://github.com/eulerto/wal2json), I am not having these problems.

I will try to inform the author of the decoderbufs plugin.