Environment set-up: Functional tests of a Java backend service run as a GitLab CI stage and are executed in a test container. A docker-compose file defines the following services: the Java backend application, Kafka, Postgres, and Wiremock. The test container uses Docker-in-Docker (dind) to run these prerequisite containers; the test container and the dind container run on the same Docker daemon, and the dind container is reachable under the hostname "docker" (its service alias).
All prerequisite containers expose ports. The Kafka container exposes port 9092 for internal traffic and port 9094 for outside traffic, with the listener configuration below (a trimmed compose sketch follows these settings):
KAFKA_ADVERTISED_LISTENERS: 'INSIDE://kafka:9092,OUTSIDE://localhost:9094'
KAFKA_LISTENERS: 'INSIDE://:9092,OUTSIDE://:9094'
KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: 'INSIDE:PLAINTEXT,OUTSIDE:PLAINTEXT'
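For context, this is roughly how the Kafka service could be declared in the compose file, based on the settings above. It is a minimal sketch only: the service and network names (zookeeper, kafka, backend-net), KAFKA_ZOOKEEPER_CONNECT, and KAFKA_INTER_BROKER_LISTENER_NAME are assumptions, not taken from the actual file.

version: "3"
services:
  zookeeper:
    image: wurstmeister/zookeeper:3.4.6
    networks:
      - backend-net
  kafka:
    image: wurstmeister/kafka:2.11-0.11.0.2
    depends_on:
      - zookeeper
    ports:
      - "9092:9092"   # INSIDE listener
      - "9094:9094"   # OUTSIDE listener
    environment:
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181   # assumed
      KAFKA_LISTENERS: 'INSIDE://:9092,OUTSIDE://:9094'
      KAFKA_ADVERTISED_LISTENERS: 'INSIDE://kafka:9092,OUTSIDE://localhost:9094'
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: 'INSIDE:PLAINTEXT,OUTSIDE:PLAINTEXT'
      KAFKA_INTER_BROKER_LISTENER_NAME: INSIDE  # assumed; disambiguates when both listeners are PLAINTEXT
    networks:
      - backend-net
  # Java backend, Postgres, and Wiremock services omitted for brevity
networks:
  backend-net:   # assumed name for the custom network shared by the prerequisite containers
    driver: bridge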
Environment variables set on the test container for connecting to the dind daemon (a sketch of the CI job follows the list):
DOCKER_HOST: tcp://docker:2376
DOCKER_TLS_CERTDIR: "/certs"
DOCKER_TLS_VERIFY: 1
DOCKER_CERT_PATH: "$DOCKER_TLS_CERTDIR/client"
JAVA_OPTS: "-XX:+UseContainerSupport -XX:InitialRAMPercentage=80"
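Roughly how the CI job could look in .gitlab-ci.yml; a minimal sketch only, where the job/stage names, the test image, and the script commands are placeholders and not taken from the actual pipeline:

functional-tests:
  stage: test                        # placeholder stage name
  image: $FUNCTIONAL_TEST_IMAGE      # placeholder; actual test image not shown
  services:
    - name: docker:20.10.20-dind
      alias: docker                  # dind daemon reachable from the job as "docker"
  variables:
    DOCKER_HOST: tcp://docker:2376
    DOCKER_TLS_CERTDIR: "/certs"
    DOCKER_TLS_VERIFY: 1
    DOCKER_CERT_PATH: "$DOCKER_TLS_CERTDIR/client"
    JAVA_OPTS: "-XX:+UseContainerSupport -XX:InitialRAMPercentage=80"
  script:
    - docker-compose up -d           # starts backend, Kafka, Postgres, Wiremock on the dind daemon
    - ./run-functional-tests.sh      # placeholder for the actual test command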
Behavior:
From the test container, the Java backend application, Postgres, and Wiremock are all reachable via the hostname "docker" and their respective published ports.
For example, the Java backend application publishes port 18080 and is reachable at docker:18080; a curl request returns a 200 OK response.
Issue:
However, the Kafka container is unreachable at docker:9094. When tested from the Java backend application container, Kafka is reachable at kafka:9092. The different hostnames are expected: the Java application and Kafka are on the same Docker network, so Kafka resolves there under the hostname "kafka", whereas from the test container the dind daemon's alias "docker" is used as the hostname to reach the published Kafka port.
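For context, the backend presumably sits on the same compose network as Kafka, which is why kafka:9092 resolves inside it. A sketch of that service entry, continuing the assumed names from the compose sketch above (image name and the bootstrap-servers variable name are placeholders):

  backend:
    image: my-backend:latest               # placeholder image name
    ports:
      - "18080:18080"                      # published port, reachable from the test container as docker:18080
    environment:
      KAFKA_BOOTSTRAP_SERVERS: kafka:9092  # placeholder variable name; resolves because both services share backend-net
    networks:
      - backend-net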
Versions:
dind: docker:20.10.20-dind
Kafka image: wurstmeister/kafka:2.11-0.11.0.2
Zookeeper image: wurstmeister/zookeeper:3.4.6
Wiremock: wiremock:2.26.3
Docker (where the test container and dind run): Docker version 20.10.23, build 7155243
The test container and dind run on a bridge Docker network. All prerequisite containers run on a separate custom Docker network, with dind as their Docker daemon.
The GitLab runner runs on an EC2 instance.
Error stack from the Kafka Java client running in the test container:
15:25:35.537 [Kafka-consumer] DEBUG o.a.kafka.common.network.Selector - [Consumer clientId=consumer-test-container-1, groupId=container-test] Connection with localhost/127.0.0.1 (channelId=1001) disconnected
java.net.ConnectException: Connection refused
at java.base/sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at java.base/sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:777)
at org.apache.kafka.common.network.PlaintextTransportLayer.finishConnect(PlaintextTransportLayer.java:50)
at org.apache.kafka.common.network.KafkaChannel.finishConnect(KafkaChannel.java:224)
at org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:526)
at org.apache.kafka.common.network.Selector.poll(Selector.java:481)
at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:560)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:280)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:251)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:242)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.awaitMetadataUpdate(ConsumerNetworkClient.java:164)
at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureCoordinatorReady(AbstractCoordinator.java:277)
at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureCoordinatorReady(AbstractCoordinator.java:240)
at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.coordinatorUnknownAndUnreadySync(ConsumerCoordinator.java:492)
at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:524)
at org.apache.kafka.clients.consumer.KafkaConsumer.updateAssignmentMetadataIfNeeded(KafkaConsumer.java:1276)
at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1240)
at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1220)
at com.ringcentral.qa.util.KafkaConsumer.run(KafkaConsumer.java:84)
at java.base/java.lang.Thread.run(Thread.java:829)
Observation: When I make an HTTP request from the test container to the Kafka container, the request reaches the broker and the error stack below is logged, which means the Kafka container accepts TCP connections on port 9094.
curl command response:
* About to connect() to docker port 9094 (#0)
* Trying 172.17.0.2...
* Connected to docker (172.17.0.2) port 9094 (#0)
> GET / HTTP/1.1
> User-Agent: curl/7.29.0
> Host: docker:9094
> Accept: */*
>
* Recv failure: Connection reset by peer
* Closing connection 0
curl: (56) Recv failure: Connection reset by peer
kafka broker logs:
[2023-10-25 05:06:44,431] WARN Unexpected error from /172.17.0.3; closing connection (org.apache.kafka.common.network.Selector)
org.apache.kafka.common.network.InvalidReceiveException: Invalid receive (size = 1195725856 larger than 104857600)
at org.apache.kafka.common.network.NetworkReceive.readFromReadableChannel(NetworkReceive.java:95)
at org.apache.kafka.common.network.NetworkReceive.readFrom(NetworkReceive.java:75)
at org.apache.kafka.common.network.KafkaChannel.receive(KafkaChannel.java:203)
at org.apache.kafka.common.network.KafkaChannel.read(KafkaChannel.java:167)
at org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:381)
at org.apache.kafka.common.network.Selector.poll(Selector.java:326)
at kafka.network.Processor.poll(SocketServer.scala:500)
at kafka.network.Processor.run(SocketServer.scala:435)
at java.lang.Thread.run(Thread.java:748)
Kafka consumer config:
allow.auto.create.topics = true
auto.commit.interval.ms = 5000
auto.offset.reset = latest
bootstrap.servers = [docker:9094]
check.crcs = true
client.dns.lookup = use_all_dns_ips
client.id = consumer-test-1
key.deserializer = class org.apache.kafka.common.serialization.LongDeserializer
partition.assignment.strategy = [class org.apache.kafka.clients.consumer.RangeAssignor, class org.apache.kafka.clients.consumer.CooperativeStickyAssignor]
security.protocol = PLAINTEXT
security.providers = null
send.buffer.bytes = 131072
session.timeout.ms = 45000
socket.connection.setup.timeout.max.ms = 30000
socket.connection.setup.timeout.ms = 10000
ssl.cipher.suites = null
ssl.enabled.protocols = [TLSv1.2, TLSv1.3]
ssl.secure.random.implementation = null
ssl.trustmanager.algorithm = PKIX
ssl.truststore.certificates = null
ssl.truststore.location = null
ssl.truststore.password = null
ssl.truststore.type = JKS
value.deserializer = class org.apache.kafka.common.serialization.StringDeserializer
Questions:
Why can't the Java Kafka client requests from the test container reach the Kafka container? Is the dind host network unable to forward the request to the Kafka bridge network?
I am running functional tests that depend on Kafka, and the Kafka client in those tests fails with a "Connection refused" error.
I am trying to make the Kafka container reachable from the test container.