Can't connect to Neo4j server on Hetzner Cloud Kubernetes cluster, while the same setup works on Azure AKS


I'm creating a Kubernetes cluster on Hetzner Cloud with the same configuration I use on Azure AKS, but I'm facing connection problems with Neo4j. On the Hetzner cluster I can access Neo4j Browser at the path I defined in my Ingress, but I can't connect to the Neo4j server using the bolt+s connection URL server.mydomain.com:7687, and neither can the Neo4j driver in my Node.js server pod (this second connection is partly solved, see the update at the end). Neither problem occurs on the AKS cluster.

From the Neo4j Browser connection debug output I see that the handshake fails:

Browser will attempt to open a websocket connection to bolt+s://server.mydomain.com:7687 and do an encrypted and an unencrypted bolt handshake.
bolt handshake
Status: Error
encrypted bolt handshake
Status: Error

In the Chrome console I see two errors:

Mixed Content: The page at 'https://server.mydomain.com/neo4j/browser/' was loaded over HTTPS, but requested an insecure resource 'http://server.mydomain.com:7687/'. This request has been blocked; the content must be served over HTTPS.

WebSocket connection to 'wss://server.mydomain.com:7687/' failed:

The one difference between the two clusters is the ingress controller's load balancer configuration. On Hetzner I set annotations in the ingress-nginx Helm chart values like so:

nginx:
  controller:
    watchIngressWithoutClass: true
    kind: DaemonSet
    config:
      use-forwarded-headers: "true"
      compute-full-forwarded-for: "true"
      use-proxy-protocol: "true"
    service:
      annotations:
        load-balancer.hetzner.cloud/name: server-lb
        load-balancer.hetzner.cloud/use-private-ip: "true"
        load-balancer.hetzner.cloud/disable-private-ingress: "true"
        load-balancer.hetzner.cloud/location: fsn1
        load-balancer.hetzner.cloud/type: lb11
        load-balancer.hetzner.cloud/uses-proxyprotocol: "true"
        load-balancer.hetzner.cloud/http-redirect-https: "true"
        load-balancer.hetzner.cloud/hostname: server.mydomain.com
        # nginx.ingress.kubernetes.io/websocket-services: neo4j

    extraArgs:
      default-ssl-certificate: "default/tls-secret"  

    # nodeSelector:
    #   server-type: server  
  tcp:
    7687: "default/neo4j:7687" 
    7474: "default/neo4j:7474"

AFAIK the ingress-nginx controller (which I'm using) handles WebSockets automatically, unlike nginx-ingress, where WebSocket services have to be mapped using an annotation like nginx.ingress.kubernetes.io/websocket-services: neo4j. I tried the annotation anyway, but it made no difference.

The complete procedure I used for the Hetzner cluster is: I created a single-node Kubernetes cluster on Hetzner Cloud using k3s v1.27.4+k3s1, then installed the ingress-nginx v4.7.1 Helm chart, exposing TCP ports 7474 and 7687 to the Neo4j service as you can see above (the load balancer's TCP ports are exposed and healthy), and the cert-manager v1.12.3 Helm chart.

In my domain's DNS manager I created an A record pointing to the load balancer's IPv4 address, with the host set to server, so that I can use server.mydomain.com in my Certificate and Ingress manifests. The tls-secret gets created correctly.

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: ingress-service
  annotations:
    nginx.ingress.kubernetes.io/use-regex: 'true'
    nginx.ingress.kubernetes.io/rewrite-target: /$2$3$4
    ingress.kubernetes.io/ssl-redirect: 'true'
    nginx.ingress.kubernetes/cluster-issuer: letsencrypt-issuer

spec:
  ingressClassName: nginx
  tls:
    - hosts:
        - server.mydomain.com
      secretName: tls-secret
  rules:

    ### Node.js server
    - http:
        paths:
          - path: /(/|$)(.*)
            # pathType: Prefix
            pathType: ImplementationSpecific
            backend:
              service:
                name: server-clusterip-service
                port:
                  number: 80
    - http:
        paths:
          - path: /server(/|$)(.*)
            # pathType: Prefix
            pathType: ImplementationSpecific
            backend:
              service:
                name: server-clusterip-service
                port:
                  number: 80

    ##### Neo4j

    - http:
        paths:
          - path: /bolt(/|$)(.*)
            # pathType: Prefix
            pathType: ImplementationSpecific
            backend:
              service:
                name: neo4j
                port:
                  number: 7687
    - http:
        paths:
          # show browser
          - path: /neo4j(/|$)(.*)
            # pathType: Prefix
            pathType: ImplementationSpecific
            backend:
              service:
                name: neo4j
                port:
                  number: 7474
    - http:
        paths:
          - path: /neo4j-admin(/|$)(.*)
            # pathType: Prefix
            pathType: ImplementationSpecific
            backend:
              service:
                name: neo4j-admin
                port:
                  number: 7474
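
The Certificate manifest referred to above is not shown in the question. A minimal sketch of what it could look like, assuming cert-manager's cert-manager.io/v1 API and a ClusterIssuer named letsencrypt-issuer (the name used in the Ingress annotation). The resource name server-tls is hypothetical, and the default namespace is an assumption based on the default/tls-secret reference in the chart values:

```yaml
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: server-tls            # hypothetical name, not from the question
  namespace: default          # assumed from "default/tls-secret"
spec:
  secretName: tls-secret      # the Secret referenced by the Ingress and the Neo4j ssl block
  dnsNames:
    - server.mydomain.com
  issuerRef:
    name: letsencrypt-issuer  # assumed to be a ClusterIssuer
    kind: ClusterIssuer
```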

To install the Neo4j chart I'm setting these values for the Neo4j configuration:

  config:
    server.bolt.enabled: 'true'
    server.bolt.tls_level: 'REQUIRED'
    server.bolt.listen_address: '0.0.0.0:7687'
    dbms.ssl.policy.bolt.client_auth: 'NONE'
    dbms.ssl.policy.bolt.enabled: 'true'

    # dbms.connector.bolt.advertised_address: '0.0.0.0:7687' #server.mydomain.com:7687 # new for hetzner (no connection still)

    ## apoc
    server.directories.plugins: '/var/lib/neo4j/labs'
    dbms.security.procedures.unrestricted: 'apoc.*'
    server.config.strict_validation.enabled: 'false'
    dbms.security.procedures.allowlist: 'gds.*,apoc.*'

    ### apoc config
    dbms.directories.plugins: "/var/lib/neo4j/labs"
    dbms.config.strict_validation: "false"



  apoc_config:
    apoc.trigger.enabled: "true"
    apoc.jdbc.neo4j.url: "jdbc:foo:bar"
    apoc.import.file.enabled: "true"

  startupProbe:
    failureThreshold: 1000
    periodSeconds: 50

  ssl:
    # setting per "connector" matching neo4j config
    bolt:
      privateKey:
        secretName: tls-secret
        subPath: tls.key
      publicCertificate:
        secretName: tls-secret
        subPath: tls.crt
      trustedCerts:
        sources: []
      revokedCerts:
        sources: []

I tried setting dbms.connector.bolt.advertised_address (though on Azure it is not set), using both the any-IP value 0.0.0.0:7687 and the specific DNS name server.mydomain.com:7687, but that didn't make a difference either. On the Hetzner Firewall I created rules for ports 80 (HTTP) and 443 (HTTPS), as well as for ports 7474 and 7687. I also tried disabling the Firewall as a test, but I still can't reach the Neo4j server. Can you spot some other configuration I need to add or change for this setup? Many thanks.
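
One detail that may be worth checking, though it is not confirmed as the cause here: the other settings above use Neo4j 5 names (server.bolt.*), and in Neo4j 5 the advertised address setting was renamed to server.bolt.advertised_address. With strict validation disabled, a setting under the old name may simply be ignored. A sketch of the values fragment under the new name, assuming the same top-level config key as above:

```yaml
config:
  # Neo4j 5 name for the Bolt advertised address
  # (formerly dbms.connector.bolt.advertised_address)
  server.bolt.advertised_address: 'server.mydomain.com:7687'
```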

Update

I noticed that the nginx-ingress-controller External IP on Azure was actually showing the IPv4 address of the load balancer, while on Hetzner it was showing the DNS name server.mydomain.com. So I removed the load-balancer.hetzner.cloud/hostname: server.mydomain.com annotation from the ingress-nginx service annotations in the Helm chart values, and without it the Neo4j driver in my Node.js server pod succeeds in connecting to Neo4j.

Unfortunately I still get the two errors when connecting from the Neo4j Browser app in the web browser:

Mixed Content: The page at 'https://server.mydomain.com/neo4j/browser/' was loaded over HTTPS, but requested an insecure resource 'http://server.mydomain.com:7687/'. This request has been blocked; the content must be served over HTTPS.

WebSocket connection to 'wss://server.mydomain.com:7687/' failed:

Update 2

I started fresh, and while issuing the Let's Encrypt certificate I found that if I don't use the annotation load-balancer.hetzner.cloud/hostname: server.mydomain.com, certificate issuance hangs, while with it issuance completes as expected.

I'm going in circles here.

Answer by Vincenzo

I reached out to the Neo4j team, and apparently the issue might have to do with Hetzner networking or their load balancers, so my approach of exposing the two TCP ports to the default neo4j service through the ingress controller (as mentioned in the quote below), while working on Azure, doesn't work on Hetzner.

There are other options, such as the nginx-ingress controller which can be configured to support TCP connections but in this guide we’re shooting for something as simple as possible that you can do with standard Kubernetes resource types.

The solution I ended up using is the one Neo4j prefers, which is a dedicated LoadBalancer service just for Neo4j. Using annotations like the ones in my ingress-nginx chart values, I create another Hetzner load balancer, though it means issuing a second TLS certificate for it and is, of course, a slightly more expensive solution.

A Service with type: LoadBalancer is the only suitable option for Neo4j. Ingress cannot be used because Neo4j’s driver protocol communicates at the TCP level and most Ingress only support HTTP communication. Node Port services cannot be associated with a static IP address which makes setting up DNS and SSL very difficult. Using NodePort can additionally create configuration challenges for some Neo4j applications that expect to use port 7687 specifically for communication with Neo4j. Because of that we recommend only using LoadBalancer services.
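
A minimal sketch of such a dedicated LoadBalancer Service, not the exact manifest used here. The Hetzner annotations are copied from the ingress-nginx values in the question. The Service name, the load balancer name neo4j-lb, the default namespace, and the app: neo4j selector are assumptions and have to match the actual Neo4j pod labels:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: neo4j-lb                                  # hypothetical Service name
  namespace: default                              # assumed namespace
  annotations:
    load-balancer.hetzner.cloud/name: neo4j-lb    # hypothetical load balancer name
    load-balancer.hetzner.cloud/location: fsn1
    load-balancer.hetzner.cloud/type: lb11
    load-balancer.hetzner.cloud/use-private-ip: "true"
    # uses-proxyprotocol is left out on purpose: this sketch assumes Neo4j's
    # Bolt connector does not expect a PROXY protocol header.
spec:
  type: LoadBalancer
  selector:
    app: neo4j                                    # assumption: must match the Neo4j pod labels
  ports:
    - name: tcp-bolt
      protocol: TCP
      port: 7687                                  # Bolt port that drivers and Neo4j Browser connect to
      targetPort: 7687
```

With a second load balancer, the bolt+s URL needs a DNS name that resolves to this load balancer's address and is covered by the certificate Neo4j serves, which is why a second certificate issuance is needed.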

They are working on making Neo4j work seamlessly behind an Ingress. One option is something like HAProxy, for which they are creating a Helm chart, but for now they are just evaluating various methods. Nothing definitive yet.