When I start cert-manager, I get the following message:
TLS handshake error from 10.42.152.128:38676: EOF
$ kubectl -n cert-manager logs cert-manager-webhook-8575f88c85-l4tlw
I0214 19:41:28.147106 1 main.go:64] "msg"="enabling TLS as certificate file flags specified"
I0214 19:41:28.147365 1 server.go:126] "msg"="listening for insecure healthz connections" "address"=":6080"
I0214 19:41:28.147418 1 server.go:138] "msg"="listening for secure connections" "address"=":10250"
I0214 19:41:28.147437 1 server.go:155] "msg"="registered pprof handlers"
I0214 19:41:28.147570 1 tls_file_source.go:144] "msg"="detected private key or certificate data on disk has changed. reloading certificate"
2020/02/14 19:43:32 http: TLS handshake error from 10.42.152.128:38676: EOF
Interestingly, there is no pod with this IP:
$ kubectl get pod -o wide --all-namespaces | grep 128
cert-manager cert-manager-webhook-8575f88c85-l4tlw 1/1 Running 0 4m56s 10.42.112.128 node002 <none> <none>
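Note that `grep 128` matches any line containing "128", so the only hit above is the webhook pod itself at 10.42.112.128, not the 10.42.152.128 from the error. A minimal sketch of an exact-match check follows; `find_pod_by_ip` is a hypothetical helper name, not part of kubectl, and it assumes the plain-text column layout shown above.

```shell
# Hypothetical helper: read `kubectl get pod -o wide --all-namespaces` output
# on stdin and print only the rows where some column equals the IP exactly.
find_pod_by_ip() {
  awk -v ip="$1" '{
    for (i = 1; i <= NF; i++) {
      if ($i == ip) { print; next }   # exact field match, not a substring
    }
  }'
}

# Usage against a live cluster (requires kubectl access):
# kubectl get pod -o wide --all-namespaces | find_pod_by_ip 10.42.152.128
```

Empty output would confirm that no pod currently holds that address.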
A similar error appears in the cert-manager pod:
E0214 19:38:22.540589 1 controller.go:131] cert-manager/controller/ingress-shim "msg"="re-queuing item due to error processing" "error"="Internal error occurred: failed calling webhook \"webhook.cert-manager.io\": Post https://cert-manager-webhook.cert-manager.svc:443/mutate?timeout=30s: net/http: TLS handshake timeout" "key"="kube-system/dashboard-kubernetes-dashboard"
I have two ClusterIssuers:
$ kubectl get ClusterIssuer --namespace cert-manager
NAME READY AGE
letsencrypt-prd True 42d
letsencrypt-stg True 42d
But there is no certificate yet:
$ kubectl get certificate --all-namespaces
No resources found
When I try to request a certificate, I get the same error:
$ kubectl apply -f mycert.yml
Error from server (InternalError): error when creating "cert-wyssmann-dev.yml": Internal error occurred: failed calling webhook "webhook.cert-manager.io": Post https://cert-manager-webhook.cert-manager.svc:443/mutate?timeout=30s: net/http: TLS handshake timeout
I am not sure how to get to the bottom of this problem. I ran sonobuoy to see whether it would help, but the test failed on 2 of my 3 nodes.
Plugin: e2e
Status: failed
Total: 1
Passed: 0
Failed: 1
Skipped: 0
Failed tests:
Container e2e is in a terminated state (exit code 1) due to reason: Error:
Plugin: systemd-logs
Status: failed
Total: 3
Passed: 1
Failed: 2
Skipped: 0
Failed tests:
timeout waiting for results
For the failing nodes, this can be seen in the sonobuoy logs:
E0214 19:38:22.540589 1 controller.go:131] cert-manager/controller/ingress-shim "msg"="re-queuing item due to error processing" "error"="Internal error occurred: failed calling webhook \"webhook.cert-manager.io\": Post https://cert-manager-webhook.cert-manager.svc:443/mutate?timeout=30s: net/http: TLS handshake timeout" "key"="kube-system/dashboard-kubernetes-dashboard"