Unable to start the REST server for kafka-connect in distributed mode

I want to run Kafka Connect in distributed mode on my 3-node setup, but a TimeoutException occurs even before I deploy my connector. Even if I submit the connector configuration before the TimeoutException hits, the request never reaches the endpoint, and a Request timed out response comes back once the REST server has stopped. I run the following command on one of my nodes:
bin/connect-distributed etc/kafka/connect-distributed.properties
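For reference, I submit the connector configuration to the worker's REST endpoint with a request along these lines (the connector name and class below are placeholders, not my real config):

curl -X POST -H "Content-Type: application/json" \
  --data '{"name": "test-file-source", "config": {"connector.class": "FileStreamSource", "tasks.max": "1", "file": "/tmp/test.txt", "topic": "test-topic"}}' \
  http://localhost:8083/connectors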
Logs:

.
.
[2019-01-30 10:16:21,126] INFO Started o.e.j.s.ServletContextHandler@5fed9976{/,null,AVAILABLE} (org.eclipse.jetty.server.handler.ContextHandler:850)
[2019-01-30 10:16:21,134] INFO Started http_8083@15d3793b{HTTP/1.1,[http/1.1]}{0.0.0.0:8083} (org.eclipse.jetty.server.AbstractConnector:292)
[2019-01-30 10:16:21,135] INFO Started @18558ms (org.eclipse.jetty.server.Server:408)
[2019-01-30 10:16:21,136] INFO Advertised URI: http://127.0.1.1:8083/ (org.apache.kafka.connect.runtime.rest.RestServer:267)
[2019-01-30 10:16:21,136] INFO REST server listening at http://127.0.1.1:8083/, advertising URL http://127.0.1.1:8083/ (org.apache.kafka.connect.runtime.rest.RestServer:217)
[2019-01-30 10:16:21,136] INFO Kafka Connect started (org.apache.kafka.connect.runtime.Connect:55)
[2019-01-30 10:17:20,594] ERROR Uncaught exception in herder work thread, exiting:  (org.apache.kafka.connect.runtime.distributed.DistributedHerder:228)
org.apache.kafka.common.errors.TimeoutException: Timeout of 60000ms expired before the position for partition connect-offsets-41 could be determined
[2019-01-30 10:17:20,620] INFO Kafka Connect stopping (org.apache.kafka.connect.runtime.Connect:65)
[2019-01-30 10:17:20,620] INFO Stopping REST server (org.apache.kafka.connect.runtime.rest.RestServer:223)
.
.

connect-distributed.properties:

bootstrap.servers=localhost:9092
group.id=connect-cluster
key.converter.schemas.enable=true
value.converter.schemas.enable=true
offset.storage.topic=connect-offsets
offset.storage.replication.factor=3
config.storage.topic=connect-configs
config.storage.replication.factor=3
status.storage.topic=connect-status
status.storage.replication.factor=3
offset.flush.interval.ms=10000
plugin.path=share/java,/root/confluent-5.1.0/share/confluent-hub-components
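For completeness: the converter classes that usually accompany these schemas.enable flags are not in the snippet above; I assume they are the JSON converters from the stock connect-distributed.properties, i.e.:

key.converter=org.apache.kafka.connect.json.JsonConverter
value.converter=org.apache.kafka.connect.json.JsonConverter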

zookeeper.properties:

dataDir=/var/zookeeper
clientPort=2181
maxClientCnxns=0 
tickTime=2000
server.1=current:2888:3888 # 0.0.0.0
server.2=kfk-2:2888:3888 #public ip
server.3=kfk-3:2888:3888 #public ip
initLimit=20
syncLimit=10
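Each node's dataDir is also expected to contain a myid file matching its server.N entry; on node 1 that would be something like (path taken from dataDir above):

echo "1" > /var/zookeeper/myid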

server.properties:

broker.id=1 #for node-1
advertised.listeners=PLAINTEXT://<public-ip>:9092
num.io.threads=8
socket.send.buffer.bytes=102400
socket.receive.buffer.bytes=102400
socket.request.max.bytes=104857600
log.dirs=/var/kafka-logs
num.partitions=1
num.recovery.threads.per.data.dir=1
offsets.topic.replication.factor=3
transaction.state.log.replication.factor=1
transaction.state.log.min.isr=1
log.retention.hours=168
log.segment.bytes=1073741824
log.retention.check.interval.ms=300000
zookeeper.connect=kfkp-1:2181,kfkp-2:2181,kfkp-3:2181 #private ips
zookeeper.connection.timeout.ms=6000
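listeners is not set explicitly, so the broker binds to all interfaces on 9092; spelled out it would look something like this (this line is not in my file, shown only for clarity):

listeners=PLAINTEXT://0.0.0.0:9092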

With the server and ZooKeeper configurations above, I was able to consume messages that a producer published to a replicated topic from all nodes, with the caveat that I had to pass --partition 0 to the consumer command (shown below). The server logs also contain a lot of WARN messages.
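The consumer command I am referring to is along these lines (the topic name is a placeholder):

bin/kafka-console-consumer --bootstrap-server localhost:9092 --topic test-topic --partition 0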

Is something wrong with the configurations? Why does the REST server throw a TimeoutException?


UPDATE:

описать "connect-status:

Topic:connect-status    PartitionCount:10   ReplicationFactor:3 Configs:cleanup.policy=compact  
Topic: connect-status   Partition: 0    Leader: 3   Replicas: 3,1,2 Isr: 3,2,1  
Topic: connect-status   Partition: 1    Leader: 3   Replicas: 1,2,3 Isr: 3,2,1  
Topic: connect-status   Partition: 2    Leader: 3   Replicas: 2,3,1 Isr: 3,2,1  
Topic: connect-status   Partition: 3    Leader: 3   Replicas: 3,2,1 Isr: 3,2,1  
Topic: connect-status   Partition: 4    Leader: 3   Replicas: 1,3,2 Isr: 3,2,1  
Topic: connect-status   Partition: 5    Leader: 3   Replicas: 2,1,3 Isr: 3,2,1  
Topic: connect-status   Partition: 6    Leader: 3   Replicas: 3,1,2 Isr: 3,2,1  
Topic: connect-status   Partition: 7    Leader: 3   Replicas: 1,2,3 Isr: 3,2,1  
Topic: connect-status   Partition: 8    Leader: 3   Replicas: 2,3,1 Isr: 3,2,1  
Topic: connect-status   Partition: 9    Leader: 3   Replicas: 3,2,1 Isr: 3,2,1  

описать" connect-config ":

Topic:connect-configs   PartitionCount:1    ReplicationFactor:3 Configs:cleanup.policy=compact
Topic: connect-configs  Partition: 0    Leader: 1   Replicas: 1,2,3 Isr: 3,2,1

опишите "смещения соединения":

Topic:connect-offsets   PartitionCount:50   ReplicationFactor:3 Configs:cleanup.policy=compact
Topic: connect-offsets  Partition: 0    Leader: 1   Replicas: 1,2,3 Isr: 3,2,1  
Topic: connect-offsets  Partition: 1    Leader: 2   Replicas: 2,3,1 Isr: 3,2,1  
Topic: connect-offsets  Partition: 2    Leader: 3   Replicas: 3,1,2 Isr: 3,2,1  
Topic: connect-offsets  Partition: 3    Leader: 1   Replicas: 1,3,2 Isr: 3,2,1  
Topic: connect-offsets  Partition: 4    Leader: 2   Replicas: 2,1,3 Isr: 3,2,1  
Topic: connect-offsets  Partition: 5    Leader: 3   Replicas: 3,2,1 Isr: 3,2,1  
Topic: connect-offsets  Partition: 6    Leader: 1   Replicas: 1,2,3 Isr: 3,2,1  
Topic: connect-offsets  Partition: 7    Leader: 2   Replicas: 2,3,1 Isr: 3,2,1  
.
.  
.
.
Topic: connect-offsets  Partition: 47   Leader: 3   Replicas: 3,2,1 Isr: 3,2,1  
Topic: connect-offsets  Partition: 48   Leader: 1   Replicas: 1,2,3 Isr: 3,2,1  
Topic: connect-offsets  Partition: 49   Leader: 2   Replicas: 2,3,1 Isr: 3,2,1
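The describe output above was produced with commands along these lines (ZooKeeper address as in server.properties):

bin/kafka-topics --zookeeper kfkp-1:2181 --describe --topic connect-offsets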

Some of the server log WARN messages whose meaning I do not understand:

.  
.
.
[2019-01-31 04:21:24,720] WARN [ReplicaFetcher replicaId=2, leaderId=3, fetcherId=0] Based on replica's leader epoch, leader replied with an unknown offset in connect-offsets-26. The initial fetch offset 0 will be used for truncation. (kafka.server.ReplicaFetcherThread)
[2019-01-31 04:21:24,721] INFO [Log partition=connect-offsets-26, dir=/var/kafka-logs] Truncating to 0 has no effect as the largest offset in the log is -1 (kafka.log.Log)
[2019-01-31 04:21:24,721] INFO [Log partition=y-1, dir=/var/kafka-logs] Truncating to 3 has no effect as the largest offset in the log is 2 (kafka.log.Log)
[2019-01-31 04:21:24,721] WARN [ReplicaFetcher replicaId=2, leaderId=3, fetcherId=0] Based on replica's leader epoch, leader replied with an unknown offset in connect-status-2. The initial fetch offset 0 will be used for truncation. (kafka.server.ReplicaFetcherThread)
[2019-01-31 04:21:24,721] INFO [Log partition=connect-status-2, dir=/var/kafka-logs] Truncating to 0 has no effect as the largest offset in the log is -1 (kafka.log.Log)
[2019-01-31 04:21:24,721] WARN [ReplicaFetcher replicaId=2, leaderId=3, fetcherId=0] Based on replica's leader epoch, leader replied with an unknown offset in connect-offsets-49. The initial fetch offset 0 will be used for truncation. (kafka.server.ReplicaFetcherThread)
[2019-01-31 04:21:24,721] INFO [Log partition=connect-offsets-49, dir=/var/kafka-logs] Truncating to 0 has no effect as the largest offset in the log is -1 (kafka.log.Log)
[2019-01-31 04:21:24,722] WARN [ReplicaFetcher replicaId=2, leaderId=3, fetcherId=0] Based on replica's leader epoch, leader replied with an unknown offset in connect-offsets-47. The initial fetch offset 0 will be used for truncation. (kafka.server.ReplicaFetcherThread)
[2019-01-31 04:21:24,722] INFO [Log partition=connect-offsets-47, dir=/var/kafka-logs] Truncating to 0 has no effect as the largest offset in the log is -1 (kafka.log.Log)
.
.
.
.
[2019-01-31 04:21:34,371] WARN [Producer clientId=producer-1] Error while fetching metadata with correlation id 98 : {__confluent.support.metrics=LEADER_NOT_AVAILABLE} (org.apache.kafka.clients.NetworkClient)
[2019-01-31 04:21:34,476] WARN [Producer clientId=producer-1] Error while fetching metadata with correlation id 99 : {__confluent.support.metrics=LEADER_NOT_AVAILABLE} (org.apache.kafka.clients.NetworkClient)
[2019-01-31 04:21:34,492] INFO [Producer clientId=producer-1] Closing the Kafka producer with timeoutMillis = 9223372036854775807 ms. (org.apache.kafka.clients.producer.KafkaProducer)
[2019-01-31 04:21:34,502] ERROR Failed to submit metrics to Kafka topic __confluent.support.metrics (due to exception): java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 10000 ms. (io.confluent.support.metrics.submitters.KafkaSubmitter)
[2019-01-31 04:21:35,997] INFO Successfully submitted metrics to Confluent via secure endpoint (io.confluent.support.metrics.submitters.ConfluentSubmitter)
.
.
.
...