Spring Boot 3.2 - Multicluster MongoDB Connection with NodePort

I have a Spring Boot 3.2 (RC2) based application that connects to MongoDB. The MongoDB ReplicaSet is deployed across two Kubernetes clusters:

  • three nodes on cluster A (plus one arbiter)
  • three nodes on cluster B

The nodes (and therefore the pods) are exposed via NodePort, so all ReplicaSet synchronization traffic also goes through the NodePorts; a sketch of such a Service follows.
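
Each member is exposed through its own NodePort Service with a fixed port, roughly along these lines (a hypothetical sketch assuming a StatefulSet; the names and ports are examples only, not my actual manifests):

apiVersion: v1
kind: Service
metadata:
  name: mongo-rs-0                                   # hypothetical Service name
spec:
  type: NodePort
  selector:
    statefulset.kubernetes.io/pod-name: mongo-rs-0   # pin the Service to a single pod
  ports:
    - port: 27017        # Service port inside the cluster
      targetPort: 27017  # mongod container port
      nodePort: 30100    # fixed NodePort; used as the member's advertised address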

My rs.status() gives a result similar to the following:

{
  set: 'rsTest',
  date: ISODate("2023-11-17T08:12:31.374Z"),
  myState: 1,
  term: Long("2"),
  syncSourceHost: '',
  syncSourceId: -1,
  heartbeatIntervalMillis: Long("2000"),
  majorityVoteCount: 4,
  writeMajorityCount: 4,
  votingMembersCount: 7,
  writableVotingMembersCount: 6,
  optimes: {
    lastCommittedOpTime: { ts: Timestamp({ t: 1700208749, i: 2 }), t: Long("2") },
    lastCommittedWallTime: ISODate("2023-11-17T08:12:29.165Z"),
    readConcernMajorityOpTime: { ts: Timestamp({ t: 1700208749, i: 2 }), t: Long("2") },
    appliedOpTime: { ts: Timestamp({ t: 1700208750, i: 6 }), t: Long("2") },
    durableOpTime: { ts: Timestamp({ t: 1700208750, i: 6 }), t: Long("2") },
    lastAppliedWallTime: ISODate("2023-11-17T08:12:30.553Z"),
    lastDurableWallTime: ISODate("2023-11-17T08:12:30.553Z")
  },
  lastStableRecoveryTimestamp: Timestamp({ t: 1700208716, i: 7 }),
  electionCandidateMetrics: {
    lastElectionReason: 'electionTimeout',
    lastElectionDate: ISODate("2023-11-10T11:37:02.387Z"),
    electionTerm: Long("2"),
    lastCommittedOpTimeAtElection: { ts: Timestamp({ t: 0, i: 0 }), t: Long("-1") },
    lastSeenOpTimeAtElection: { ts: Timestamp({ t: 1699616220, i: 9 }), t: Long("1") },
    numVotesNeeded: 1,
    priorityAtElection: 5,
    electionTimeoutMillis: Long("10000"),
    newTermStartDate: ISODate("2023-11-10T11:37:02.392Z"),
    wMajorityWriteAvailabilityDate: ISODate("2023-11-10T11:37:02.399Z")
  },
  members: [
    {
      _id: 0,
      name: '<WORKER-NODE-CLUSTER-A-IP-1>:30100',
      health: 1,
      state: 1,
      stateStr: 'PRIMARY',
      uptime: 592530,
      optime: { ts: Timestamp({ t: 1700208750, i: 6 }), t: Long("2") },
      optimeDate: ISODate("2023-11-17T08:12:30.000Z"),
      lastAppliedWallTime: ISODate("2023-11-17T08:12:30.553Z"),
      lastDurableWallTime: ISODate("2023-11-17T08:12:30.553Z"),
      syncSourceHost: '',
      syncSourceId: -1,
      infoMessage: '',
      electionTime: Timestamp({ t: 1699616222, i: 1 }),
      electionDate: ISODate("2023-11-10T11:37:02.000Z"),
      configVersion: 17,
      configTerm: 2,
      self: true,
      lastHeartbeatMessage: ''
    },
    {
      _id: 1,
      name: '<ARBITER-SVC>:27017',
      health: 1,
      state: 7,
      stateStr: 'ARBITER',
      uptime: 592517,
      lastHeartbeat: ISODate("2023-11-17T08:12:30.975Z"),
      lastHeartbeatRecv: ISODate("2023-11-17T08:12:30.977Z"),
      pingMs: Long("0"),
      lastHeartbeatMessage: '',
      syncSourceHost: '',
      syncSourceId: -1,
      infoMessage: '',
      configVersion: 17,
      configTerm: 2
    },
    {
      _id: 2,
      name: '<WORKER-NODE-CLUSTER-A-IP-2>:30101',
      health: 1,
      state: 2,
      stateStr: 'SECONDARY',
      uptime: 592478,
      optime: { ts: Timestamp({ t: 1700208749, i: 6 }), t: Long("2") },
      optimeDurable: { ts: Timestamp({ t: 1700208749, i: 6 }), t: Long("2") },
      optimeDate: ISODate("2023-11-17T08:12:29.000Z"),
      optimeDurableDate: ISODate("2023-11-17T08:12:29.000Z"),
      lastAppliedWallTime: ISODate("2023-11-17T08:12:30.553Z"),
      lastDurableWallTime: ISODate("2023-11-17T08:12:30.553Z"),
      lastHeartbeat: ISODate("2023-11-17T08:12:30.080Z"),
      lastHeartbeatRecv: ISODate("2023-11-17T08:12:31.179Z"),
      pingMs: Long("0"),
      lastHeartbeatMessage: '',
      syncSourceHost: '<WORKER-NODE-CLUSTER-A-IP-1>:30100',
      syncSourceId: 0,
      infoMessage: '',
      configVersion: 17,
      configTerm: 2
    },
    {
      _id: 3,
      name: '<WORKER-NODE-CLUSTER-A-IP-3>:30102',
      health: 1,
      state: 2,
      stateStr: 'SECONDARY',
      uptime: 592447,
      optime: { ts: Timestamp({ t: 1700208749, i: 6 }), t: Long("2") },
      optimeDurable: { ts: Timestamp({ t: 1700208749, i: 6 }), t: Long("2") },
      optimeDate: ISODate("2023-11-17T08:12:29.000Z"),
      optimeDurableDate: ISODate("2023-11-17T08:12:29.000Z"),
      lastAppliedWallTime: ISODate("2023-11-17T08:12:30.553Z"),
      lastDurableWallTime: ISODate("2023-11-17T08:12:30.553Z"),
      lastHeartbeat: ISODate("2023-11-17T08:12:30.190Z"),
      lastHeartbeatRecv: ISODate("2023-11-17T08:12:30.775Z"),
      pingMs: Long("0"),
      lastHeartbeatMessage: '',
      syncSourceHost: '<WORKER-NODE-CLUSTER-A-IP-2>:30101',
      syncSourceId: 2,
      infoMessage: '',
      configVersion: 17,
      configTerm: 2
    },
    {
      _id: 4,
      name: '<WORKER-NODE-CLUSTER-B-IP-1>:30100',
      health: 1,
      state: 2,
      stateStr: 'SECONDARY',
      uptime: 25736,
      optime: { ts: Timestamp({ t: 1700208749, i: 2 }), t: Long("2") },
      optimeDurable: { ts: Timestamp({ t: 1700208749, i: 2 }), t: Long("2") },
      optimeDate: ISODate("2023-11-17T08:12:29.000Z"),
      optimeDurableDate: ISODate("2023-11-17T08:12:29.000Z"),
      lastAppliedWallTime: ISODate("2023-11-17T08:12:29.165Z"),
      lastDurableWallTime: ISODate("2023-11-17T08:12:29.165Z"),
      lastHeartbeat: ISODate("2023-11-17T08:12:30.370Z"),
      lastHeartbeatRecv: ISODate("2023-11-17T08:12:28.906Z"),
      pingMs: Long("596"),
      lastHeartbeatMessage: '',
      syncSourceHost: '<WORKER-NODE-CLUSTER-A-IP-2>:30101',
      syncSourceId: 2,
      infoMessage: '',
      configVersion: 17,
      configTerm: 2
    },
    {
      _id: 5,
      name: '<WORKER-NODE-CLUSTER-B-IP-2>:30101',
      health: 1,
      state: 2,
      stateStr: 'SECONDARY',
      uptime: 21862,
      optime: { ts: Timestamp({ t: 1700208749, i: 2 }), t: Long("2") },
      optimeDurable: { ts: Timestamp({ t: 1700208749, i: 2 }), t: Long("2") },
      optimeDate: ISODate("2023-11-17T08:12:29.000Z"),
      optimeDurableDate: ISODate("2023-11-17T08:12:29.000Z"),
      lastAppliedWallTime: ISODate("2023-11-17T08:12:29.165Z"),
      lastDurableWallTime: ISODate("2023-11-17T08:12:29.165Z"),
      lastHeartbeat: ISODate("2023-11-17T08:12:30.790Z"),
      lastHeartbeatRecv: ISODate("2023-11-17T08:12:29.443Z"),
      pingMs: Long("175"),
      lastHeartbeatMessage: '',
      syncSourceHost: '<WORKER-NODE-CLUSTER-A-IP-2>:30101',
      syncSourceId: 2,
      infoMessage: '',
      configVersion: 17,
      configTerm: 2
    },
    {
      _id: 6,
      name: '<WORKER-NODE-CLUSTER-B-IP-2>:30102',
      health: 1,
      state: 2,
      stateStr: 'SECONDARY',
      uptime: 41858,
      optime: { ts: Timestamp({ t: 1700208742, i: 1 }), t: Long("2") },
      optimeDurable: { ts: Timestamp({ t: 1700208741, i: 5 }), t: Long("2") },
      optimeDate: ISODate("2023-11-17T08:12:22.000Z"),
      optimeDurableDate: ISODate("2023-11-17T08:12:21.000Z"),
      lastAppliedWallTime: ISODate("2023-11-17T08:12:22.457Z"),
      lastDurableWallTime: ISODate("2023-11-17T08:12:21.511Z"),
      lastHeartbeat: ISODate("2023-11-17T08:12:29.040Z"),
      lastHeartbeatRecv: ISODate("2023-11-17T08:12:28.382Z"),
      pingMs: Long("1115"),
      lastHeartbeatMessage: '',
      syncSourceHost: '<WORKER-NODE-CLUSTER-A-IP-1>:30100',
      syncSourceId: 0,
      infoMessage: '',
      configVersion: 17,
      configTerm: 2
    }
  ],
  ok: 1,
  '$clusterTime': {
    clusterTime: Timestamp({ t: 1700208750, i: 6 }),
    signature: {
      hash: Binary.createFromBase64("3pWba14pRUH9FpER+6qWF6I=", 0),
      keyId: Long("7299796089241075715")
    }
  },
  operationTime: Timestamp({ t: 1700208750, i: 6 })
}

So the ReplicaSet members were scheduled onto more or less arbitrary worker nodes of my Kubernetes clusters. So far so good.

In the Spring Boot application the value of spring.data.mongodb.uri is as follows:

mongodb://my_user:my_pwd@
<MASTER-NODE-CLUSTER-A-IP-1>:30100,
<MASTER-NODE-CLUSTER-A-IP-2>:30101,
<MASTER-NODE-CLUSTER-A-IP-3>:30102,
<INFRA-NODE-CLUSTER-A-IP-1>:30100,
<INFRA-NODE-CLUSTER-A-IP-2>:30101,
<INFRA-NODE-CLUSTER-A-IP-3>:30102,
<MASTER-NODE-CLUSTER-B-IP-1>:30100,
<MASTER-NODE-CLUSTER-B-IP-2>:30101,
<MASTER-NODE-CLUSTER-B-IP-3>:30102,
<INFRA-NODE-CLUSTER-B-IP-1>:30100,
<INFRA-NODE-CLUSTER-B-IP-2>:30101,
<INFRA-NODE-CLUSTER-B-IP-3>:30102
/my_db?replicaSet=rsTest&authSource=my_db

Note that instead of connecting to the IPs of the worker nodes that the ReplicaSet members happen to run on, I go through the NodePorts via the IPs of the master and infra nodes of my Kubernetes clusters, which are more stable (a NodePort is reachable on every node of the cluster).
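
For reference, driver timeouts can also be tuned directly in the connection string via standard options; the values below are arbitrary examples, not what I currently use:

mongodb://my_user:my_pwd@
<SAME-HOST-LIST-AS-ABOVE>
/my_db?replicaSet=rsTest&authSource=my_db&serverSelectionTimeoutMS=10000&connectTimeoutMS=5000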

The result is that the application sometimes starts correctly and works, although, as expected, I see logs like this (the driver uses my seed list only for the initial discovery, then replaces it with the member addresses reported by the replica set itself, so the master/infra addresses get dropped):

<MASTER-NODE-..> is no longer a member of the replica set. Removing from client view of cluster

but it often fails to start because the MongoHealthIndicator times out:

o.s.b.a.health.HealthEndpointSupport : Health contributor org.springframework.boot.actuate.data.mongo.MongoHealthIndicator (mongo) took 30103ms to respond

caused by:

Timed out after 30000 ms while waiting for a server that matches ReadPreferenceServerSelector{readPreference=primary}. Client view of cluster state is {type=REPLICA_SET, servers=[{address=WORKER-NODE-CLUSTER-A-IP-3:30102, type=REPLICA_SET_SECONDARY, roundTripTime=20.0 ms, state=CONNECTED}, {address=ARBITER-SVC:27017, state=CONNECTED}, {address=WORKER-NODE-CLUSTER-A-IP-2:30101, type=REPLICA_SET_SECONDARY, roundTripTime=20.7 ms, state=CONNECTED}, {address=WORKER-NODE-CLUSTER-A-IP-1:30100, type=UNKNOWN, state=CONNECTING}, {address=WORKER-NODE-CLUSTER-B-IP-3:30102, type=REPLICA_SET_SECONDARY, roundTripTime=715.3 ms, state=CONNECTED}, {address=WORKER-NODE-CLUSTER-B-IP-2:30101, type=REPLICA_SET_SECONDARY, roundTripTime=131.8 ms, state=CONNECTED}, {address=WORKER-NODE-CLUSTER-B-IP-1:30100, type=REPLICA_SET_SECONDARY, roundTripTime=793.6 ms, state=CONNECTED}]

Apparently the driver cannot reach the PRIMARY: in the client view above, the primary's address (WORKER-NODE-CLUSTER-A-IP-1:30100) is stuck in state CONNECTING with type UNKNOWN, so a health check with readPreference=primary has nothing to select and times out after the driver's default serverSelectionTimeoutMS of 30000 ms.
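
For what it's worth, here is a minimal sketch (the class name is hypothetical, and the timeout value is an arbitrary example) of how the driver's changing view of the cluster could be traced from the application, assuming Spring Boot's standard Mongo auto-configuration:

import java.util.concurrent.TimeUnit;

import org.springframework.boot.autoconfigure.mongo.MongoClientSettingsBuilderCustomizer;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

import com.mongodb.event.ClusterDescriptionChangedEvent;
import com.mongodb.event.ClusterListener;

@Configuration
public class MongoDiagnosticsConfig {

    @Bean
    public MongoClientSettingsBuilderCustomizer mongoDiagnostics() {
        return builder -> builder.applyToClusterSettings(cluster -> {
            // Fail faster than the 30 s default so the startup problem surfaces quickly
            cluster.serverSelectionTimeout(10, TimeUnit.SECONDS);
            // Print every change in the driver's view of the replica set
            cluster.addClusterListener(new ClusterListener() {
                @Override
                public void clusterDescriptionChanged(ClusterDescriptionChangedEvent event) {
                    System.out.println("Cluster view: " + event.getNewDescription().getShortDescription());
                }
            });
        });
    }
}

With something like this in place, the UNKNOWN/CONNECTING transitions of the primary's address should show up in the startup logs; alternatively, the health contributor can be switched off for testing with management.health.mongo.enabled=false.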

Can anyone point me in the right direction? Why does the application sometimes start without problems, while at other times I get these errors and it doesn't start at all?

Thanks everyone in advance!
