serf: предупреждение о глубине очереди запросов, бесконечный отсчет, нет документации об источнике ошибки - PullRequest
0 голосов
/ 14 февраля 2019

Со вчерашнего дня мы начали получать эту ошибку в промежутках между запусками заданий, которая бесконечно ведет к увеличению (до 1245 до ручного перезапуска)

Мы запускаем планировщик заданий Dkron с ETCD в качестве бэкэнда.

Пример ошибки:

INFO [2019-02-14T10: 03: 39 + 02: 00] 2019/02/14 10:03:39 [WARN] serf: Queryглубина очереди: 218

Кажется, что нет никакой документации или помощи, чтобы понять это, я был в состоянии найти.Если кто-то знает, что является причиной этого, или какую-либо дополнительную информацию о нем, я был бы очень признателен!

Ниже я добавлю копию журналов более подробно на случай, если что-то из этого может помочь.Еще раз большое спасибо!(и извинения, если я не предоставил достаточно информации, я не уверен, какая информация будет важной, поскольку я не знаю источника этой ошибки)

INFO[2019-02-14T10:03:49+02:00] 2019/02/14 10:03:49 [WARN] serf: Query queue depth: 218 
INFO[2019-02-14T10:03:50+02:00] 2019/02/14 10:03:50 [WARN] serf: Query queue depth: 218 
INFO[2019-02-14T10:03:51+02:00] 2019/02/14 10:03:51 [WARN] serf: Query queue depth: 218 
INFO[2019-02-14T10:03:52+02:00] 2019/02/14 10:03:52 [WARN] serf: Query queue depth: 218 
INFO[2019-02-14T10:03:53+02:00] 2019/02/14 10:03:53 [WARN] serf: Query queue depth: 218 
INFO[2019-02-14T10:03:54+02:00] 2019/02/14 10:03:54 [WARN] serf: Query queue depth: 218 
INFO[2019-02-14T10:03:55+02:00] 2019/02/14 10:03:55 [WARN] serf: Query queue depth: 218 
INFO[2019-02-14T10:03:56+02:00] 2019/02/14 10:03:56 [WARN] serf: Query queue depth: 218 
INFO[2019-02-14T10:03:57+02:00] 2019/02/14 10:03:57 [WARN] serf: Query queue depth: 218 
INFO[2019-02-14T10:03:58+02:00] 2019/02/14 10:03:58 [WARN] serf: Query queue depth: 218 
INFO[2019-02-14T10:03:59+02:00] 2019/02/14 10:03:59 [WARN] serf: Query queue depth: 218 
DEBU[2019-02-14T10:04:00+02:00] scheduler: Run job                            job=60minute_jobs node=Dkron_Main schedule="0 4 * * * *"
DEBU[2019-02-14T10:04:00+02:00] scheduler: Run job                            job=2minute_jobs node=Dkron_Main schedule="0 */2 * * * *"
DEBU[2019-02-14T10:04:00+02:00] store: Retrieved job from datastore           job=2minute_jobs node=Dkron_Main
DEBU[2019-02-14T10:04:00+02:00] agent: Filtered nodes to run: []              node=Dkron_Main
DEBU[2019-02-14T10:04:00+02:00] agent: Filtered tags to run: map[]            node=Dkron_Main
INFO[2019-02-14T10:04:00+02:00] agent: Sending query                          job_name=2minute_jobs node=Dkron_Main query="run:job"
DEBU[2019-02-14T10:04:00+02:00] agent: Sending query                          job_name=2minute_jobs json="{\"execution\":{\"job_name\":\"2minute_jobs\",\"started_at\":\"0001-01-01T00:00:00Z\",\"finished_at\":\"0001-01-01T00:00:00Z\",\"group\":1550131440007370602,\"attempt\":1},\"rpc_addr\":\"127.0.0.1:6868\"}" node=Dkron_Main query="run:job"
INFO[2019-02-14T10:04:00+02:00] agent: Received event                         event="query: run:job" node=Dkron_Main
DEBU[2019-02-14T10:04:00+02:00] agent: Running job                            at=219 node=Dkron_Main payload="{\"execution\":{\"job_name\":\"2minute_jobs\",\"started_at\":\"0001-01-01T00:00:00Z\",\"finished_at\":\"0001-01-01T00:00:00Z\",\"group\":1550131440007370602,\"attempt\":1},\"rpc_addr\":\"127.0.0.1:6868\"}" query="run:job"
INFO[2019-02-14T10:04:00+02:00] agent: Starting job                           job=2minute_jobs node=Dkron_Main
DEBU[2019-02-14T10:04:00+02:00] scheduler: Run job                            job=1minute_jobs node=Dkron_Main schedule="0 * * * * *"
INFO[2019-02-14T10:04:00+02:00] 2019/02/14 10:04:00 [DEBUG] serf: messageQueryResponseType: Dkron_Main 
DEBU[2019-02-14T10:04:00+02:00] agent: Received ack                           from=Dkron_Main node=Dkron_Main query="run:job"
DEBU[2019-02-14T10:04:00+02:00] store: Retrieved job from datastore           job=60minute_jobs node=Dkron_Main
DEBU[2019-02-14T10:04:00+02:00] agent: Filtered nodes to run: []              node=Dkron_Main
DEBU[2019-02-14T10:04:00+02:00] agent: Filtered tags to run: map[]            node=Dkron_Main
INFO[2019-02-14T10:04:00+02:00] agent: Sending query                          job_name=60minute_jobs node=Dkron_Main query="run:job"
DEBU[2019-02-14T10:04:00+02:00] agent: Sending query                          job_name=60minute_jobs json="{\"execution\":{\"job_name\":\"60minute_jobs\",\"started_at\":\"0001-01-01T00:00:00Z\",\"finished_at\":\"0001-01-01T00:00:00Z\",\"group\":1550131440007318183,\"attempt\":1},\"rpc_addr\":\"127.0.0.1:6868\"}" node=Dkron_Main query="run:job"
INFO[2019-02-14T10:04:00+02:00] 2019/02/14 10:04:00 [DEBUG] serf: messageQueryResponseType: Dkron_Main 
DEBU[2019-02-14T10:04:00+02:00] agent: Received ack                           from=Dkron_Main node=Dkron_Main query="run:job"
DEBU[2019-02-14T10:04:00+02:00] store: Retrieved job from datastore           job=1minute_jobs node=Dkron_Main
DEBU[2019-02-14T10:04:00+02:00] agent: Filtered nodes to run: []              node=Dkron_Main
DEBU[2019-02-14T10:04:00+02:00] agent: Filtered tags to run: map[]            node=Dkron_Main
INFO[2019-02-14T10:04:00+02:00] agent: Sending query                          job_name=1minute_jobs node=Dkron_Main query="run:job"
DEBU[2019-02-14T10:04:00+02:00] agent: Sending query                          job_name=1minute_jobs json="{\"execution\":{\"job_name\":\"1minute_jobs\",\"started_at\":\"0001-01-01T00:00:00Z\",\"finished_at\":\"0001-01-01T00:00:00Z\",\"group\":1550131440027758329,\"attempt\":1},\"rpc_addr\":\"127.0.0.1:6868\"}" node=Dkron_Main query="run:job"
INFO[2019-02-14T10:04:00+02:00] 2019/02/14 10:04:00 [DEBUG] serf: messageQueryResponseType: Dkron_Main 
DEBU[2019-02-14T10:04:00+02:00] agent: Received ack                           from=Dkron_Main node=Dkron_Main query="run:job"
DEBU[2019-02-14T10:04:00+02:00] grpc: Received GetJob                         job=2minute_jobs node=Dkron_Main
DEBU[2019-02-14T10:04:00+02:00] store: Retrieved job from datastore           job=2minute_jobs node=Dkron_Main
DEBU[2019-02-14T10:04:00+02:00] agent: GetJob by RPC                          job=2minute_jobs node=Dkron_Main
INFO[2019-02-14T10:04:00+02:00] agent: Received event                         event="query: run:job" node=Dkron_Main
DEBU[2019-02-14T10:04:00+02:00] agent: Running job                            at=220 node=Dkron_Main payload="{\"execution\":{\"job_name\":\"60minute_jobs\",\"started_at\":\"0001-01-01T00:00:00Z\",\"finished_at\":\"0001-01-01T00:00:00Z\",\"group\":1550131440007318183,\"attempt\":1},\"rpc_addr\":\"127.0.0.1:6868\"}" query="run:job"
INFO[2019-02-14T10:04:00+02:00] agent: Starting job                           job=60minute_jobs node=Dkron_Main
DEBU[2019-02-14T10:04:00+02:00] invoke: calling executor plugin               node=Dkron_Main plugin=shell
INFO[2019-02-14T10:04:00+02:00] 2019/02/14 10:04:00 [DEBUG] serf: messageQueryResponseType: Dkron_Main 
DEBU[2019-02-14T10:04:00+02:00] agent: Received response                      from=Dkron_Main node=Dkron_Main query="run:job" response="{\"job_name\":\"2minute_jobs\",\"started_at\":\"2019-02-14T10:04:00.126925316+02:00\",\"finished_at\":\"0001-01-01T00:00:00Z\",\"node_name\":\"Dkron_Main\",\"group\":1550131440007370602,\"attempt\":1}"
DEBU[2019-02-14T10:04:00+02:00] store: Setting key                            execution=1550131440126925316-Dkron_Main job=2minute_jobs node=Dkron_Main
DEBU[2019-02-14T10:04:00+02:00] shell: going to run bash /home/benemenadmin/dkron_jobs/bash/2minute_jobs.sh  node=Dkron_Main subsystem_name=plugins.dkron-executor-shell
DEBU[2019-02-14T10:04:00+02:00] store: to detele key                          execution=1550119440072343171-Dkron_Main job=2minute_jobs node=Dkron_Main
DEBU[2019-02-14T10:04:00+02:00] grpc: Received GetJob                         job=60minute_jobs node=Dkron_Main
DEBU[2019-02-14T10:04:00+02:00] store: Retrieved job from datastore           job=60minute_jobs node=Dkron_Main
DEBU[2019-02-14T10:04:00+02:00] agent: GetJob by RPC                          job=60minute_jobs node=Dkron_Main
INFO[2019-02-14T10:04:00+02:00] agent: Received event                         event="query: run:job" node=Dkron_Main
DEBU[2019-02-14T10:04:00+02:00] agent: Running job                            at=221 node=Dkron_Main payload="{\"execution\":{\"job_name\":\"1minute_jobs\",\"started_at\":\"0001-01-01T00:00:00Z\",\"finished_at\":\"0001-01-01T00:00:00Z\",\"group\":1550131440027758329,\"attempt\":1},\"rpc_addr\":\"127.0.0.1:6868\"}" query="run:job"
INFO[2019-02-14T10:04:00+02:00] agent: Starting job                           job=1minute_jobs node=Dkron_Main
DEBU[2019-02-14T10:04:00+02:00] invoke: calling executor plugin               node=Dkron_Main plugin=shell
INFO[2019-02-14T10:04:00+02:00] 2019/02/14 10:04:00 [DEBUG] serf: messageQueryResponseType: Dkron_Main 
DEBU[2019-02-14T10:04:00+02:00] agent: Received response                      from=Dkron_Main node=Dkron_Main query="run:job" response="{\"job_name\":\"60minute_jobs\",\"started_at\":\"2019-02-14T10:04:00.202107834+02:00\",\"finished_at\":\"0001-01-01T00:00:00Z\",\"node_name\":\"Dkron_Main\",\"group\":1550131440007318183,\"attempt\":1}"
DEBU[2019-02-14T10:04:00+02:00] store: Setting key                            execution=1550131440202107834-Dkron_Main job=60minute_jobs node=Dkron_Main
DEBU[2019-02-14T10:04:00+02:00] shell: going to run bash /home/benemenadmin/dkron_jobs/bash/60minute_jobs.sh  node=Dkron_Main subsystem_name=plugins.dkron-executor-shell
DEBU[2019-02-14T10:04:00+02:00] store: to detele key                          execution=1549757040048767013-Dkron_Main job=60minute_jobs node=Dkron_Main
DEBU[2019-02-14T10:04:00+02:00] grpc: Received GetJob                         job=1minute_jobs node=Dkron_Main
DEBU[2019-02-14T10:04:00+02:00] store: Retrieved job from datastore           job=1minute_jobs node=Dkron_Main
DEBU[2019-02-14T10:04:00+02:00] agent: GetJob by RPC                          job=1minute_jobs node=Dkron_Main
DEBU[2019-02-14T10:04:00+02:00] invoke: calling executor plugin               node=Dkron_Main plugin=shell
DEBU[2019-02-14T10:04:00+02:00] shell: going to run bash /home/benemenadmin/dkron_jobs/bash/1minute_jobs.sh  node=Dkron_Main subsystem_name=plugins.dkron-executor-shell
INFO[2019-02-14T10:04:00+02:00] 2019/02/14 10:04:00 [DEBUG] serf: messageQueryResponseType: Dkron_Main 
DEBU[2019-02-14T10:04:00+02:00] agent: Received response                      from=Dkron_Main node=Dkron_Main query="run:job" response="{\"job_name\":\"1minute_jobs\",\"started_at\":\"2019-02-14T10:04:00.316814676+02:00\",\"finished_at\":\"0001-01-01T00:00:00Z\",\"node_name\":\"Dkron_Main\",\"group\":1550131440027758329,\"attempt\":1}"
DEBU[2019-02-14T10:04:00+02:00] store: Setting key                            execution=1550131440316814676-Dkron_Main job=1minute_jobs node=Dkron_Main
DEBU[2019-02-14T10:04:00+02:00] store: to detele key                          execution=1550125380180647982-Dkron_Main job=1minute_jobs node=Dkron_Main
INFO[2019-02-14T10:04:00+02:00] 2019/02/14 10:04:00 [WARN] serf: Query queue depth: 221 
INFO[2019-02-14T10:04:01+02:00] 2019/02/14 10:04:01 [WARN] serf: Query queue depth: 221 
INFO[2019-02-14T10:04:02+02:00] 2019/02/14 10:04:02 [WARN] serf: Query queue depth: 221 
DEBU[2019-02-14T10:04:03+02:00] agent: Done receiving acks and responses      node=Dkron_Main query="run:job"
DEBU[2019-02-14T10:04:03+02:00] agent: Done receiving acks and responses      node=Dkron_Main query="run:job"
DEBU[2019-02-14T10:04:03+02:00] agent: Done receiving acks and responses      node=Dkron_Main query="run:job"
INFO[2019-02-14T10:04:03+02:00] 2019/02/14 10:04:03 [WARN] serf: Query queue depth: 221 
INFO[2019-02-14T10:04:04+02:00] 2019/02/14 10:04:04 [WARN] serf: Query queue depth: 221 
INFO[2019-02-14T10:04:05+02:00] 2019/02/14 10:04:05 [WARN] serf: Query queue depth: 221 
INFO[2019-02-14T10:04:06+02:00] 2019/02/14 10:04:06 [WARN] serf: Query queue depth: 221 
INFO[2019-02-14T10:04:07+02:00] 2019/02/14 10:04:07 [WARN] serf: Query queue depth: 221 
DEBU[2019-02-14T10:04:07+02:00] shell: Command output                         node=Dkron_Main subsystem_name=plugins.dkron-executor-shell
DEBU[2019-02-14T10:04:07+02:00]                                               node=Dkron_Main subsystem_name=plugins.dkron-executor-shell
DEBU[2019-02-14T10:04:07+02:00] 2Minute_Jobs complete                         node=Dkron_Main subsystem_name=plugins.dkron-executor-shell
DEBU[2019-02-14T10:04:07+02:00]                                               node=Dkron_Main subsystem_name=plugins.dkron-executor-shell
DEBU[2019-02-14T10:04:07+02:00] grpc: Received execution done                 group=1550131440007370602 job=2minute_jobs node=Dkron_Main
DEBU[2019-02-14T10:04:07+02:00] store: Retrieved job from datastore           job=2minute_jobs node=Dkron_Main
DEBU[2019-02-14T10:04:07+02:00] store: Setting key                            execution=1550131440126925316-Dkron_Main job=2minute_jobs node=Dkron_Main
DEBU[2019-02-14T10:04:08+02:00] grpc: from: Dkron_Main                        node=Dkron_Main
INFO[2019-02-14T10:04:08+02:00] 2019/02/14 10:04:08 [WARN] serf: Query queue depth: 221 
INFO[2019-02-14T10:04:09+02:00] 2019/02/14 10:04:09 [WARN] serf: Query queue depth: 221 
INFO[2019-02-14T10:04:10+02:00] 2019/02/14 10:04:10 [WARN] serf: Query queue depth: 221 
INFO[2019-02-14T10:04:11+02:00] 2019/02/14 10:04:11 [WARN] serf: Query queue depth: 221 
INFO[2019-02-14T10:04:12+02:00] 2019/02/14 10:04:12 [WARN] serf: Query queue depth: 221 
INFO[2019-02-14T10:04:13+02:00] 2019/02/14 10:04:13 [WARN] serf: Query queue depth: 221 
INFO[2019-02-14T10:04:14+02:00] 2019/02/14 10:04:14 [WARN] serf: Query queue depth: 221 
INFO[2019-02-14T10:04:15+02:00] 2019/02/14 10:04:15 [WARN] serf: Query queue depth: 221 
INFO[2019-02-14T10:04:16+02:00] 2019/02/14 10:04:16 [WARN] serf: Query queue depth: 221 
INFO[2019-02-14T10:04:17+02:00] 2019/02/14 10:04:17 [WARN] serf: Query queue depth: 221 
...