cert-manager - Acme Http Solver Returns 404


I have a Kubernetes cluster with an nginx ingress in front of a service, which I am trying to expose over HTTPS using cert-manager and an ACME ClusterIssuer.

I am reasonably happy with the cert-manager steps I have followed so far, but I am now at the stage where a challenge is made to the HTTP solver that cert-manager has deployed in the cluster as part of the challenge process. When I describe the challenge generated for my service, I see that its state is pending with:

Reason:      Waiting for http-01 challenge propagation: failed to perform self check GET request 'http://www.example.com/.well-known/acme-challenge/nDWOHEMXgy70_wxi53ijEKjUHFlzg_UJJS-sv_ahGzg': Get "http://www.example.com/.well-known/acme-challenge/nDWOHEMXgy70_wxi53ijEKjUHFlzg_UJJS-sv_ahGzg": dial tcp xx.xx.xx.xxx:80: connect: connection timed out

When I call the solver's url from my k8s host server:

curl -H "Host: www.example.com" http://192.168.1.11:31344/.well-known/acme-challenge/nDWOHEMXgy70_wxi53ijEKjUHFlzg_UJJS-sv_ahGzg

I get a 200 OK back.

NOTE: The address 192.168.1.11 is the IP of the k8s node on which the HTTP solver pod is running, and port 31344 is the node port of the NodePort service for the HTTP solver pod.

I am trying to figure out why the challenge itself times out instead of getting a 200 back.

I have also tested the HTTP solver's URL from my mobile phone over 4G (instead of wifi), and that way I get a 200 OK. This tells me that the HTTP solver is reachable from the outside, through the firewall and via nginx into the service and pod, right? If that is the case, what other reason(s) could there be for the token not being retrievable from the same URL?

--- CURRENT CONFIGS ---

Cluster Issuer:

apiVersion: cert-manager.io/v1alpha2
kind: ClusterIssuer
metadata:
 name: letsencrypt-staging
 namespace: cert-manager
spec:
 acme:
   # The ACME server URL
   server: https://acme-staging-v02.api.letsencrypt.org/directory
   # Email address used for ACME registration
   email: [email protected]
   # Name of a secret used to store the ACME account private key
   privateKeySecretRef:
     name: letsencrypt-staging
   # Enable the HTTP-01 challenge provider
   solvers:
   - selector: {}
     http01:
       ingress:
         class: nginx

Ingress:

apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: ing-myservice-web
  namespace: myservice
  annotations:
    kubernetes.io/ingress.class: "nginx"
    cert-manager.io/cluster-issuer: "letsencrypt-staging"
spec:
  tls:
  - hosts:
    - www.example.com
    secretName: secret-myservice-web-tls
  rules:
  - host: www.example.com
    http:
      paths:
      - backend:
          serviceName: svc-myservice-web
          servicePort: 8080
        path: /
  - host: www.example.co.uk
    http:
      paths:
        - backend:
            serviceName: svc-myservice-web
            servicePort: 8080
          path: /

There is 1 answer

Going Bananas On

After reading up on how cert-manager works, looking at other people's similar issues in other posts, and getting a better understanding of how my network is set up and seen from the outside, here is what I learnt about my setup and what I did to get cert-manager working for the services in my k8s cluster.

Setup:

  • kubernetes cluster with backend services fronted by nginx ingress controller with a NodePort service exposing ports 25080 and 25443 for http and https respectively.
  • kubernetes cluster in private network behind ISP's public IP.
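For illustration, the NodePort service described in the setup could look roughly like the sketch below. The service name, namespace and selector labels are assumptions (they follow common ingress-nginx conventions and are not taken from my actual manifests); only the node ports 25080 and 25443 come from the setup above.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: ingress-nginx-controller        # assumed name
  namespace: ingress-nginx              # assumed namespace
spec:
  type: NodePort
  selector:
    # assumed labels; they must match the labels on the nginx controller pods
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/component: controller
  ports:
  - name: http
    port: 80
    targetPort: 80
    nodePort: 25080                     # http node port from the setup above
  - name: https
    port: 443
    targetPort: 443
    nodePort: 25443                     # https node port from the setup above
```

Note that 25080 and 25443 lie outside Kubernetes' default node port range (30000-32767), so a service like this is only accepted if the API server runs with a wider --service-node-port-range.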

Solution:

  • Configured a local http proxy running on port 80 outside the k8s cluster which forwards requests to the nginx controller's NodePort IP and port 25080.

  • Configured bind9 on my network to resolve www to the host where the local http proxy is running.

  • Configured the k8s cluster's CoreDNS to forward to the bind9 host (instead of 8.8.4.4, etc.).

  • Configured my private network's entry point router to forward incoming traffic on port 80 to the nginx controller's NodePort IP and port 25080.

  • Configured my private network's entry point router to forward incoming traffic on port 443 to the nginx controller's NodePort IP and port 25443.
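The CoreDNS step in the list above could look roughly like the following sketch. It assumes a kubeadm-style ConfigMap named coredns in kube-system, and a bind9 host at the hypothetical address 192.168.1.2. The Corefile is a typical default one in which only the forward line has been changed; an existing Corefile may differ in its other plugins.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns              # assumed: kubeadm-style CoreDNS ConfigMap
  namespace: kube-system
data:
  Corefile: |
    .:53 {
        errors
        health
        ready
        kubernetes cluster.local in-addr.arpa ip6.arpa {
           pods insecure
           fallthrough in-addr.arpa ip6.arpa
           ttl 30
        }
        prometheus :9153
        # Send non-cluster lookups to the local bind9 host
        # (192.168.1.2 is a hypothetical address) instead of a public resolver.
        forward . 192.168.1.2
        cache 30
        loop
        reload
        loadbalance
    }
```

With the reload plugin enabled, CoreDNS should pick up the edited Corefile after a short delay; otherwise restarting the CoreDNS pods applies the change.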

The main reason for this solution is that my ISP does not allow hosts within my private network to call out and back into the network via the network's public IP address. (I believe this restriction is quite common. The feature that would allow it is called hairpinning or NAT loopback, and some routers have an option to turn it on.)

So, for cert-manager's HTTP-01 self check (which is made from inside the k8s cluster) to succeed, requests for www had to reach the nginx controller via the locally hosted http proxy instead of going out to the world wide web and back in again (which my ISP does not allow).
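One way to check this from inside the cluster is to request the challenge URL from a throwaway pod, which resolves and routes www the same way other in-cluster clients do. The manifest below is only a sketch: the pod name is hypothetical, the public curlimages/curl image is an assumption, and the URL is the one from the error message in the question.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: acme-selfcheck-test          # hypothetical name
spec:
  restartPolicy: Never               # run the request once, then stop
  containers:
  - name: curl
    image: curlimages/curl           # assumed image; any image with curl works
    command: ["curl"]
    args:
    - "-sv"                          # verbose, so the connect attempt is visible
    - "--max-time"
    - "10"                           # give up after 10 seconds instead of hanging
    - "http://www.example.com/.well-known/acme-challenge/nDWOHEMXgy70_wxi53ijEKjUHFlzg_UJJS-sv_ahGzg"
```

After applying it with kubectl apply -f, the result can be read with kubectl logs acme-selfcheck-test. A timeout here, while the same URL returns 200 from a phone over 4G, would point to the loopback problem described above rather than to the solver itself.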

With this solution in place the self check passed, the challenge completed, and cert-manager was able to issue certificates successfully.

I am sure (and I hope) there are better and cleaner solutions for this sort of scenario out there, but I have not come across any yet, so this is the solution I currently have in place.