Unable to redeploy certificates after expiry in OpenShift 3.11


I have deployed OpenShift (OKD) 3.11 using https://github.com/openshift/openshift-ansible/tree/release-3.11. I want to reproduce a scenario where the certificates expire and test how they can be renewed.

To do this, I set the following variables in the inventory to 1 day (so that the certificates expire quickly):


As expected, after 1 day the oc commands stopped working and the master-api and master-etcd pods were in the Exited state. I then wanted to renew all the certificates, so I ran the redeploy-certificates playbook, following https://docs.openshift.com/container-platform/3.11/install_config/redeploying_certificates.html#redeploying-all-certificates-current-ca

ansible-playbook -i openshift-ansible/playbooks/inventory.ini openshift-ansible/playbooks/redeploy-certificates.yml

But this Ansible play aborts with the following error:

TASK [Wait for master to restart] **********************************************************************************************************
skipping: [master.]

TASK [Wait for master API to come back online] *********************************************************************************************
skipping: [master.]

TASK [openshift_control_plane : restart master] ********************************************************************************************
changed: [master.] => (item=api)
changed: [master.] => (item=controllers)

RUNNING HANDLER [openshift_control_plane : verify API server] ******************************************************************************
FAILED - RETRYING: verify API server (120 retries left).
FAILED - RETRYING: verify API server (119 retries left).
FAILED - RETRYING: verify API server (2 retries left).
FAILED - RETRYING: verify API server (1 retries left).
fatal: [master.]: FAILED! => {
    "attempts": 120,
    "changed": false,
    "cmd": [
    "delta": "0:00:00.012426",
    "end": "2020-11-29 22:56:24.445762",
    "rc": 7,
    "start": "2020-11-29 22:56:24.433336"


non-zero return code

RUNNING HANDLER [openshift_control_plane : verify Local API server] ************************************************************************

Please let me know if I'm missing anything while redeploying the certificates, or if there is an alternative way to renew them.


I have also tried the redeploy-openshift-ca.yml playbook with -e openshift_redeploy_openshift_ca=true:

ansible-playbook -i openshift-ansible/playbooks/inventory.ini openshift-ansible/playbooks/openshift-master/redeploy-openshift-ca.yml -e openshift_redeploy_openshift_ca=true

But this play also fails at the same task as before, where it waits for master-api to be running.

The master-api Docker logs show:

I1202 18:02:55.930375       1 plugins.go:84] Registered admission plugin "SecurityContextDeny"
I1202 18:02:55.930387       1 plugins.go:84] Registered admission plugin "ServiceAccount"
I1202 18:02:55.930396       1 plugins.go:84] Registered admission plugin "DefaultStorageClass"
I1202 18:02:55.930408       1 plugins.go:84] Registered admission plugin "PersistentVolumeClaimResize"
I1202 18:02:55.930418       1 plugins.go:84] Registered admission plugin "StorageObjectInUseProtection"
F1202 18:03:25.933354       1 start_api.go:68] dial tcp connect: connection refused

The etcd Docker logs show:

2020-12-02 18:05:14.459240 I | embed: ready to serve client requests
2020-12-02 18:05:14.459730 I | embed: serving client requests on
WARNING: 2020/12/02 18:05:14 Failed to dial connection error: desc = "transport: authentication handshake failed: remote error: tls: bad certificate"; please retry.
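
Since the etcd log reports "tls: bad certificate", it may help to confirm which certificate files on the master are actually expired after the playbook run. The following is a minimal sketch, not part of openshift-ansible: check_cert is a hypothetical helper, and the listed paths are assumed to be the default locations on an OKD 3.11 master, so they may need adjusting for a given host.

```shell
#!/bin/sh
# check_cert is a hypothetical helper (not from openshift-ansible).
# It prints whether a PEM certificate file is missing, expired or still valid.
check_cert() {
    cert="$1"
    if [ ! -r "$cert" ]; then
        echo "MISSING  $cert"
        return 2
    fi
    # "openssl x509 -checkend 0" exits non-zero if the certificate
    # has already expired (or cannot be parsed).
    if openssl x509 -checkend 0 -noout -in "$cert" >/dev/null 2>&1; then
        echo "VALID    $cert ($(openssl x509 -enddate -noout -in "$cert" 2>/dev/null))"
    else
        echo "EXPIRED  $cert ($(openssl x509 -enddate -noout -in "$cert" 2>/dev/null))"
        return 1
    fi
}

# Assumed default certificate locations on an OKD 3.11 master.
for cert in \
    /etc/origin/master/ca.crt \
    /etc/origin/master/master.server.crt \
    /etc/origin/master/master.etcd-client.crt \
    /etc/etcd/ca.crt \
    /etc/etcd/server.crt \
    /etc/etcd/peer.crt
do
    # "|| true" keeps the loop going when a file is missing or expired.
    check_cert "$cert" || true
done
```

Running this on the master before and after redeploy-certificates.yml would show whether the etcd and master certificates were actually reissued or are still the expired ones.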

