x509 certificate problems on flannel and coredns pods

I'd appreciate any help getting to the root cause of this error. I created a Kubernetes 1.11.1 cluster with 1 master and 2 nodes (the master and one node are on the same machine) on CentOS 7.5 virtual machines.

$ uname -a
Linux master.home 3.10.0-862.14.4.el7.x86_64 #1 SMP Wed Sep 26 15:12:11 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
$ cat /etc/centos-release
CentOS Linux release 7.5.1804 (Core)

$ docker version
Client:
 Version:      17.09.1-ce
 API version:  1.32
 Go version:   go1.8.3
 Git commit:   19e2cf6
 Built:        Thu Dec  7 22:23:40 2017
 OS/Arch:      linux/amd64

Server:
 Version:      17.09.1-ce
 API version:  1.32 (minimum version 1.12)
 Go version:   go1.8.3
 Git commit:   19e2cf6
 Built:        Thu Dec  7 22:25:03 2017
 OS/Arch:      linux/amd64
 Experimental: false

$ kubectl version
Client Version: version.Info{Major:"1", Minor:"11", GitVersion:"v1.11.0", GitCommit:"91e7b4fd31fcd3d5f436da26c980becec37ceefe", GitTreeState:"clean", BuildDate:"2018-06-27T20:17:28Z", GoVersion:"go1.10.2", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"11", GitVersion:"v1.11.0", GitCommit:"91e7b4fd31fcd3d5f436da26c980becec37ceefe", GitTreeState:"clean", BuildDate:"2018-06-27T20:08:34Z", GoVersion:"go1.10.2", Compiler:"gc", Platform:"linux/amd64"}

The node interface IP addresses look fine to me:

$ ip address
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: enp0s3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 08:00:27:5b:19:5f brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.111/24 brd 192.168.1.255 scope global dynamic enp0s3
       valid_lft 82102sec preferred_lft 82102sec
    inet6 fe80::a00:27ff:fe5b:195f/64 scope link tentative dadfailed 
       valid_lft forever preferred_lft forever
3: docker0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default 
    link/ether 02:42:38:a6:c7:bd brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 scope global docker0
       valid_lft forever preferred_lft forever
    inet6 fe80::42:38ff:fea6:c7bd/64 scope link 
       valid_lft forever preferred_lft forever
4: flannel.1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN group default 
    link/ether 26:1e:c3:e9:a3:db brd ff:ff:ff:ff:ff:ff
    inet 10.150.69.0/32 scope global flannel.1
       valid_lft forever preferred_lft forever
    inet6 fe80::241e:c3ff:fee9:a3db/64 scope link 
       valid_lft forever preferred_lft forever
12: vethc0ae215@if11: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker0 state UP group default 
    link/ether 9a:1c:9d:21:18:57 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet6 fe80::981c:9dff:fe21:1857/64 scope link 
       valid_lft forever preferred_lft forever

The etcd cluster is healthy:

$ etcdctl --endpoints=https://192.168.1.111:2379 --cert-file=/var/lib/kubernetes/kubernetes.pem --key-file=/var/lib/kubernetes/kubernetes-key.pem cluster-health
member ca38fd8eb3e17372 is healthy: got healthy result from https://192.168.1.111:2379
cluster is healthy
$ etcdctl --endpoints=https://192.168.1.111:2379 --cert-file=/var/lib/kubernetes/kubernetes.pem --key-file=/var/lib/kubernetes/kubernetes-key.pem get /atomic.io/network/config
{ "Network": "10.150.0.0/16", "SubnetLen": 24, "Backend": {"Type": "vxlan"}}

iptables has also been updated:

$ iptables --version
iptables v1.6.2

Overview from kubectl:

$ kubectl get all --all-namespaces
NAMESPACE     NAME                              READY     STATUS             RESTARTS   AGE
kube-system   pod/coredns-55f86bf584-9vz6k      1/1       Running            11         39m
kube-system   pod/coredns-55f86bf584-z4nvv      1/1       Running            11         39m
kube-system   pod/kube-flannel-ds-amd64-kw972   0/1       CrashLoopBackOff   6          10m
kube-system   pod/kube-flannel-ds-amd64-rhv2c   0/1       CrashLoopBackOff   6          10m

NAMESPACE     NAME                 TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)         AGE
default       service/kubernetes   ClusterIP   10.32.0.1    <none>        443/TCP         2h
kube-system   service/kube-dns     ClusterIP   10.32.0.10   <none>        53/UDP,53/TCP   39m

NAMESPACE     NAME                                     DESIRED   CURRENT   READY     UP-TO-DATE   AVAILABLE   NODE SELECTOR                     AGE
kube-system   daemonset.apps/kube-flannel-ds-amd64     2         2         0         2            0           beta.kubernetes.io/arch=amd64     10m
kube-system   daemonset.apps/kube-flannel-ds-arm       0         0         0         0            0           beta.kubernetes.io/arch=arm       10m
kube-system   daemonset.apps/kube-flannel-ds-arm64     0         0         0         0            0           beta.kubernetes.io/arch=arm64     10m
kube-system   daemonset.apps/kube-flannel-ds-ppc64le   0         0         0         0            0           beta.kubernetes.io/arch=ppc64le   10m
kube-system   daemonset.apps/kube-flannel-ds-s390x     0         0         0         0            0           beta.kubernetes.io/arch=s390x     10m

NAMESPACE     NAME                      DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
kube-system   deployment.apps/coredns   2         2         2            2           39m

NAMESPACE     NAME                                 DESIRED   CURRENT   READY     AGE
kube-system   replicaset.apps/coredns-55f86bf584   2         2         2         39m

I used this manifest for flannel, where I changed the default "Network": "10.244.0.0/16" to "Network": "10.150.0.0/16":

$ kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
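
After the edit, the net-conf.json section of the manifest's ConfigMap reads (excerpt; everything else left as shipped):

  net-conf.json: |
    {
      "Network": "10.150.0.0/16",
      "Backend": {
        "Type": "vxlan"
      }
    }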

And this one for CoreDNS:

kubectl apply -f https://storage.googleapis.com/kubernetes-the-hard-way/coredns.yaml

I'm not sure why I'm seeing x509 complaints in the logs of the respective pods:

$ kubectl logs kube-flannel-ds-amd64-kw972 -n kube-system
I1126 14:51:38.415251       1 main.go:475] Determining IP address of default interface
I1126 14:51:38.417393       1 main.go:488] Using interface with name enp0s3 and address 192.168.1.111
I1126 14:51:38.417535       1 main.go:505] Defaulting external address to interface address (192.168.1.111)
E1126 14:51:38.427865       1 main.go:232] Failed to create SubnetManager: error retrieving pod spec for 'kube-system/kube-flannel-ds-amd64-kw972': Get https://10.32.0.1:443/api/v1/namespaces/kube-system/pods/kube-flannel-ds-amd64-kw972: x509: certificate is valid for 192.168.1.111, 127.0.0.1, not 10.32.0.1

$ kubectl logs coredns-55f86bf584-z4nvv -n kube-system
E1126 14:50:51.845470       1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:355: Failed to list *v1.Namespace: Get https://10.32.0.1:443/api/v1/namespaces?limit=500&resourceVersion=0: x509: certificate is valid for 192.168.1.111, 127.0.0.1, not 10.32.0.1
E1126 14:50:51.850446       1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:348: Failed to list *v1.Service: Get https://10.32.0.1:443/api/v1/services?limit=500&resourceVersion=0: x509: certificate is valid for 192.168.1.111, 127.0.0.1, not 10.32.0.1

Here 192.168.1.111 is my master node and 10.32.0.1 is the IP of the kubernetes service.
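
To see exactly which SANs that certificate carries, it can be inspected with openssl (assuming the API server serves the same kubernetes.pem used in the etcdctl calls above):

$ openssl x509 -in /var/lib/kubernetes/kubernetes.pem -noout -text | grep -A1 'Subject Alternative Name'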

I did not use kubeadm to bring up this cluster. I did most of the bootstrapping by following https://github.com/kelseyhightower/kubernetes-the-hard-way.

I'm also not sure whether SNAT is configured correctly:

$ sudo conntrack -L -d 10.32.0.1
tcp      6 17 TIME_WAIT src=192.168.1.111 dst=10.32.0.1 sport=37862 dport=443 src=192.168.1.111 dst=192.168.1.111 sport=6443 dport=37862 [ASSURED] mark=0 use=1
conntrack v1.4.4 (conntrack-tools): 1 flow entries have been shown.
$ sudo iptables -t nat -L KUBE-SERVICES
Chain KUBE-SERVICES (2 references)
target     prot opt source               destination         
KUBE-MARK-MASQ  udp  -- !10.150.0.0/16        10.32.0.10           /* kube-system/kube-dns:dns cluster IP */ udp dpt:domain
KUBE-SVC-TCOU7JCQXEZGVUNU  udp  --  anywhere             10.32.0.10           /* kube-system/kube-dns:dns cluster IP */ udp dpt:domain
KUBE-MARK-MASQ  tcp  -- !10.150.0.0/16        10.32.0.10           /* kube-system/kube-dns:dns-tcp cluster IP */ tcp dpt:domain
KUBE-SVC-ERIFXISQEP7F7OF4  tcp  --  anywhere             10.32.0.10           /* kube-system/kube-dns:dns-tcp cluster IP */ tcp dpt:domain
KUBE-MARK-MASQ  tcp  -- !10.150.0.0/16        10.32.0.1            /* default/kubernetes:https cluster IP */ tcp dpt:https
KUBE-SVC-NPX46M4PTMTKRN6Y  tcp  --  anywhere             10.32.0.1            /* default/kubernetes:https cluster IP */ tcp dpt:https
KUBE-NODEPORTS  all  --  anywhere             anywhere             /* kubernetes service nodeports; NOTE: this must be the last rule in this chain */ ADDRTYPE match dst-type LOCAL
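
The per-service chain from that output can be dumped as well to confirm the endpoint it DNATs to:

$ sudo iptables -t nat -L KUBE-SVC-NPX46M4PTMTKRN6Y -n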

Flannel configuration:

$ cat /etc/sysconfig/flanneld 
# Flanneld configuration options  

# etcd url location.  Point this to the server where etcd runs
FLANNEL_ETCD_ENDPOINTS="https://192.168.1.111:2379"

# etcd config key.  This is the configuration key that flannel queries
# For address range assignment
FLANNEL_ETCD_PREFIX="/atomic.io/network"

# Any additional options that you want to pass
FLANNEL_OPTIONS="-v=9 --etcd-certfile=/var/lib/kubernetes/kubernetes.pem --etcd-keyfile=/var/lib/kubernetes/kubernetes-key.pem --remote-cafile=/var/lib/kubernetes/ca.pem"
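
With -v=9 set, flanneld's verbose logs can be tailed via journald (assuming flanneld also runs as a systemd unit, given the sysconfig file above):

$ sudo journalctl -u flanneld -f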

Edit 1: Updated the title to better reflect the underlying problem. My goal is to make sure DNS works as expected in my k8s ecosystem. Tested nslookup with a busybox 1.28 image (pod creation is sketched after the output below).

$ kubectl exec -ti busybox -- nslookup kubernetes
Server:    10.32.0.10
Address 1: 10.32.0.10

nslookup: can't resolve 'kubernetes'
command terminated with exit code 1
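
For reference, the busybox pod was created roughly like this (the standard DNS-debugging pattern; exact flags are an approximation):

$ kubectl run busybox --image=busybox:1.28 --restart=Never -- sleep 3600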

Update: The x509 error is gone and coredns is running after upgrading docker to 18.06.1-ce and editing the kubelet.service file to use --container-runtime-endpoint=unix:///var/run/docker/containerd/docker-containerd.sock (see the sketch below). One step closer, but not there yet.
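
The kubelet.service change amounts to adding that flag to the ExecStart line; a sketch of the unit-file excerpt, assuming the kubernetes-the-hard-way layout with the binary in /usr/local/bin (all other flags left unchanged):

# /etc/systemd/system/kubelet.service (excerpt)
[Service]
ExecStart=/usr/local/bin/kubelet \
  --container-runtime-endpoint=unix:///var/run/docker/containerd/docker-containerd.sock

$ sudo systemctl daemon-reload
$ sudo systemctl restart kubelet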

$ kubectl get pods --all-namespaces
NAMESPACE     NAME                          READY     STATUS        RESTARTS   AGE
default       busybox                       1/1       Terminating   1          1h
kube-system   coredns-55f86bf584-n84nw      1/1       Running       0          10m
kube-system   coredns-55f86bf584-zl88b      1/1       Running       0          10m
$ kubectl logs coredns-55f86bf584-n84nw -n kube-system
.:53
2018/11/26 18:49:48 [INFO] CoreDNS-1.2.2
2018/11/26 18:49:48 [INFO] linux/amd64, go1.11, eb51e8b
CoreDNS-1.2.2
linux/amd64, go1.11, eb51e8b
2018/11/26 18:49:48 [INFO] plugin/reload: Running configuration MD5 = 2e2180a5eeb3ebf92a5100ab081a6381
...