I have a Kubernetes cluster with an nginx ingress in front of a service which I am trying to set up with HTTPS access using cert-manager and an ACME ClusterIssuer.
I am reasonably happy with the steps I have followed from the cert-manager documentation, but I am currently at the stage where a challenge is made against the http solver that cert-manager has configured in the cluster as part of the challenge process. When I describe the service's generated challenge I see that its state is pending with:
Reason: Waiting for http-01 challenge propagation: failed to perform self check GET request 'http://www.example.com/.well-known/acme-challenge/nDWOHEMXgy70_wxi53ijEKjUHFlzg_UJJS-sv_ahGzg': Get "http://www.example.com/.well-known/acme-challenge/nDWOHEMXgy70_wxi53ijEKjUHFlzg_UJJS-sv_ahGzg": dial tcp xx.xx.xx.xxx:80: connect: connection timed out
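For reference, this is how I inspect the generated challenge (cert-manager creates the Challenge resource in the same namespace as the Ingress, myservice in my case, and the generated name will differ):

kubectl get challenges -n myservice
kubectl describe challenge <generated-challenge-name> -n myservice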
When I call the solver's URL from my k8s host server:
curl -H "Host: www.example.com" http://192.168.1.11:31344/.well-known/acme-challenge/nDWOHEMXgy70_wxi53ijEKjUHFlzg_UJJS-sv_ahGzg
I get a 200 OK back.
NOTE: The address 192.168.1.11 is the IP of the k8s node on which the http solver pod is running, and port 31344 is the node port of the NodePort service that exposes the http solver pod.
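For anyone wondering how I found those values: cert-manager creates the solver pod and its NodePort service with a cm-acme-http-solver name prefix, so something like this shows which node the pod landed on and which node port was allocated:

kubectl get pods -n myservice -o wide | grep cm-acme-http-solver
kubectl get svc -n myservice | grep cm-acme-http-solver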
I am trying to figure out why the challenge's self check times out instead of getting a 200 back.
I have also tested the http solver's URL from my mobile phone over 4G (instead of Wi-Fi), and that way I get a 200 OK. This tells me that the http solver is reachable from the outside, through the firewall and via nginx into the service and pod, right? And if that is the case, what other reason(s) could there be for Let's Encrypt not being able to retrieve the token from the same URL?
--- CURRENT CONFIGS ---
Cluster Issuer:
apiVersion: cert-manager.io/v1alpha2
kind: ClusterIssuer
metadata:
  name: letsencrypt-staging
  namespace: cert-manager
spec:
  acme:
    # The ACME server URL
    server: https://acme-staging-v02.api.letsencrypt.org/directory
    # Email address used for ACME registration
    email: [email protected]
    # Name of a secret used to store the ACME account private key
    privateKeySecretRef:
      name: letsencrypt-staging
    # Enable the HTTP-01 challenge provider
    solvers:
    - selector: {}
      http01:
        ingress:
          class: nginx
Ingress:
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: ing-myservice-web
  namespace: myservice
  annotations:
    kubernetes.io/ingress.class: "nginx"
    cert-manager.io/cluster-issuer: "letsencrypt-staging"
spec:
  tls:
  - hosts:
    - www.example.com
    secretName: secret-myservice-web-tls
  rules:
  - host: www.example.com
    http:
      paths:
      - backend:
          serviceName: svc-myservice-web
          servicePort: 8080
        path: /
  - host: www.example.co.uk
    http:
      paths:
      - backend:
          serviceName: svc-myservice-web
          servicePort: 8080
        path: /
--- SOLUTION ---
After reading up on various aspects of how cert-manager works, reading about other people's similar issues in other posts, and getting a better understanding of how my network is set up and how it is seen from the outside, here is what I learnt about my setup and what I then did to get cert-manager working for the domain services in the k8s cluster.
Setup:
nginx ingress controller with a NodePort service exposing ports 25080 and 25443 for HTTP and HTTPS respectively.
Solution:
- Configured a local HTTP proxy, running on port 80 outside the k8s cluster, which forwards requests to the nginx controller's NodePort IP on port 25080 (a minimal sketch follows below).
- Configured bind9 on my network to point www at the host where the local HTTP proxy is running (sketch below).
- Configured the k8s cluster's CoreDNS to forward to the bind9 host instead of 8.8.4.4, etc. (sketch below).
- Configured my private network's entry point router to forward inbound traffic on port 80 to the nginx controller's NodePort IP, port 25080.
- Configured my private network's entry point router to forward inbound traffic on port 443 to the nginx controller's NodePort IP, port 25443.

The main reason for this solution is that my ISP does not allow hosts within my private network to call out and back into the network via the network's public IP address. (I believe this is quite common for ISPs; it's called hairpinning or NAT loopback, and some routers have the functionality to turn it on.)
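A minimal sketch of such a proxy, here using nginx on the proxy host (192.168.1.20 is a stand-in for the nginx controller's NodePort IP, and the file path is just one conventional place to put it):

# /etc/nginx/conf.d/k8s-http-proxy.conf (sketch; addresses are stand-ins)
server {
    listen 80;
    location / {
        proxy_pass http://192.168.1.20:25080;    # nginx controller's NodePort (http)
        proxy_set_header Host $host;             # keep the original hostname so the ingress can route it
        proxy_set_header X-Real-IP $remote_addr;
    }
}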
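The relevant bind9 change is just an A record pointing www at the proxy host. A minimal sketch of the zone, with 192.168.1.10 standing in for the proxy host and 192.168.1.5 for the bind9 host:

; db.example.com (sketch; addresses and serial are stand-ins)
$TTL 300
@    IN SOA ns1.example.com. admin.example.com. ( 2021010101 3600 600 86400 300 )
@    IN NS  ns1.example.com.
ns1  IN A   192.168.1.5     ; the bind9 host itself
www  IN A   192.168.1.10    ; the host running the local http proxy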
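And for CoreDNS, the change sits in the forward plugin of the Corefile, edited via the coredns ConfigMap in kube-system. A sketch, again with 192.168.1.5 standing in for the bind9 host:

# kubectl -n kube-system edit configmap coredns
.:53 {
    errors
    health
    kubernetes cluster.local in-addr.arpa ip6.arpa {
        pods insecure
        fallthrough in-addr.arpa ip6.arpa
    }
    forward . 192.168.1.5   # was 8.8.4.4 etc.; now the bind9 host
    cache 30
    loop
    reload
}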
So, in order for the cert-manager http solver pod (running within the k8s cluster) to be able to complete the challenge, it was necessary for it to be able to reach the nginx controller by forcing the network routing for www via the locally hosted HTTP proxy, instead of going out to the world wide web and back in again (which my ISP does not allow).

With this solution in place the http solver pod was able to complete the challenge, and thereafter cert-manager was able to issue certificates successfully.

I am sure (and I hope) there are better and cleaner solutions for this sort of scenario out there, but I have not come across any myself yet, so this is the solution I currently have in place.
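For completeness, this is roughly how to confirm the result once the challenge passes (ingress-shim normally names the Certificate after the tls secretName, secret-myservice-web-tls above, but check what kubectl get actually shows):

kubectl get certificate -n myservice          # READY should go to True
kubectl describe certificate secret-myservice-web-tls -n myservice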