Is there a way to determine which etcd host the Kubernetes apiserver is talking to?

Only the apiserver talks directly to etcd, and the etcd cluster has many hosts. I would like to see which etcd host the apiserver is talking to. This may differ for each API resource, such as Pod or Node. Ideally, I would like to see the etcd host for each request.

Specifically, I am running Kubernetes 1.6.13 and etcd 3.1.14 with the v3 store.

I have tried:

  1. Enable etcd client and gRPC logging on the Kubernetes API server.

    I think gRPC only logs unexpected events, and the same goes for etcd clientv3. I was not able to get any information about the etcd side of the connection.

  2. Enable http2 debug logging with GODEBUG=http2debug=2 on the API server.

    To my surprise, the http2 debug logs print a lot of information about each request, but I could not find the remote endpoint in them. I am not completely sure about this, though; I may have missed a mention in the log files.

  3. Debug logs on the etcd side.

    Enabling debug logs as described in "Enabling Debug Logging" only prints v2 store accesses. For the v3 store one could use the http://<host>:2379/debug/requests endpoint, but that is not available in my version of etcd (3.1.14).

  4. I have not yet tried GODEBUG=http2debug=2 on the etcd side. Maybe the http2 logs on etcd have the information I need.

  5. tcpdump or tcpflow

    The apiserver <-> etcd connection is encrypted. Would these show me the request URL? I do not think I saw that information in the dumps.

  6. Run a man-in-the-middle attack on the apiserver <-> etcd connection with mitmproxy. I do not think this should be that complicated.

I hope I have missed a super obvious and simple way to accomplish this.


Update:

About using lsof based approaches:

Using lsof, we can list the connections, with their endpoints, at a single point in time. I do not think there is enough information in the lsof output to work out the endpoint for each request. The apiserver opens a lot of connections to etcd; looking at the code (see NewStorage), that observation seems reasonable to me.

$ sudo lsof -p 20816 | grep :2379  | wc -l
130

The connections look like this:

$ sudo lsof -p 20816 | grep :2379  | head -n 5
hyperkube 20816 root    3u     IPv4 58093240        0t0        TCP compute-master7001.dsv31.boxdc.net:36360->compute-etcd7001.dsv31.boxdc.net:2379 (ESTABLISHED)
hyperkube 20816 root    5u     IPv4 58085987        0t0        TCP compute-master7001.dsv31.boxdc.net:26005->compute-etcd7002.dsv31.boxdc.net:2379 (ESTABLISHED)
hyperkube 20816 root    6u     IPv4 58085988        0t0        TCP compute-master7001.dsv31.boxdc.net:55650->compute-etcd7003.dsv31.boxdc.net:2379 (ESTABLISHED)
hyperkube 20816 root    7u     IPv4 58102030        0t0        TCP compute-master7001.dsv31.boxdc.net:36366->compute-etcd7001.dsv31.boxdc.net:2379 (ESTABLISHED)
hyperkube 20816 root    8u     IPv4 58085990        0t0        TCP compute-master7001.dsv31.boxdc.net:55654->compute-etcd7003.dsv31.boxdc.net:2379 (ESTABLISHED)
........

Looking at this, I cannot tell which etcd host is used for each request between the apiserver and etcd.
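
For what it is worth, the snapshot can at least be summarized per etcd host. The following is a minimal sketch, not a per-request answer: the function name count_etcd_peers is hypothetical, and it assumes the last columns of each lsof line have the local:port->remote:port (ESTABLISHED) form shown in the output above.

```shell
#!/usr/bin/env bash
# Count ESTABLISHED connections per remote endpoint from lsof output on stdin.
# count_etcd_peers is a hypothetical helper name; the port defaults to 2379.
count_etcd_peers() {
  local port="${1:-2379}"
  awk -v port="$port" '
    /->/ && /\(ESTABLISHED\)/ {
      # Locate the field that holds "local->remote".
      for (i = 1; i <= NF; i++) {
        if ($i ~ /->/) {
          split($i, ends, "->")
          # Only count peers whose remote port matches the etcd client port.
          if (ends[2] ~ (":" port "$")) peers[ends[2]]++
        }
      }
    }
    END {
      for (p in peers) print peers[p], p
    }
  ' | sort -k1,1nr -k2,2
}

# Usage (PID taken from the output above):
#   sudo lsof -p 20816 | count_etcd_peers
```

This prints one line per etcd endpoint with its connection count, highest first. It is still a point-in-time view, so it does not say which of those connections served a given request.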


Update:

I think that in the etcd clientv3 code that ships with Kubernetes 1.6.13, the grpc.Balancer.Get function returns the endpoint address used for each gRPC request. One could add a log statement there and make the apiserver log the etcd address per request.

There is 1 answer

user607473 On

Find the PID of the apiserver:

ps aux | grep apiserver

Then use lsof to see its open socket connections:

lsof -p $PID | grep :2379
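
The two steps can be combined into one helper. This is only a sketch under a few assumptions: the function name etcd_sockets_for_apiserver is made up for illustration, the process command line is assumed to contain "apiserver", and pgrep and lsof are assumed to be installed. Using pgrep avoids the grep process matching itself in the ps output.

```shell
#!/usr/bin/env bash
# List a process's TCP sockets on the etcd client port.
# etcd_sockets_for_apiserver is a hypothetical helper name.
# Arguments: [pattern] (default "apiserver"), [port] (default 2379).
etcd_sockets_for_apiserver() {
  local pattern="${1:-apiserver}" port="${2:-2379}" pid
  # pgrep -f matches against the full command line; -o picks the oldest match.
  pid="$(pgrep -o -f "$pattern")" || {
    echo "no process matches '$pattern'" >&2
    return 1
  }
  # -a ANDs the selectors; -n and -P keep addresses and ports numeric.
  lsof -a -p "$pid" -i "TCP:$port" -n -P
}

# Usage (run as root if the apiserver belongs to another user):
#   etcd_sockets_for_apiserver
```

As with the plain lsof command, this shows which etcd hosts have open connections at that moment, not which host served a particular request.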