Streaming from Kafka to Cassandra using Logstash
1 vote
/ February 14, 2020

I am trying to stream from a Kafka topic into a table in Cassandra. I have a config file in .../logstash-7.6.0/config; the file is named pipeline.conf and its contents are as follows:

input {
    kafka {
            bootstrap_servers => "localhost:9092"
            topics => ["test"]
    }
}
output {
    cassandra {
        hosts => ["127.0.0.1"]
        port => 9042
        protocol_version => 4
        consistency => "any"
        keyspace => "task3"
        table => "%{[@metadata][logfile]}"
        hints => {
            host => "text"
            time => "timestamp"
            response_time => "int"
            status => "int"
            method => "text"
            protocol => "text"
            endpoint => "text"
            uuid => "uuid"
        }
        retry_policy => { "type" => "default" }
        request_timeout => 1
        ignore_bad_values => false
        flush_size => 500
        idle_flush_time => 1
    }
}
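
For reference, the hints above imply a Cassandra schema along the following lines. This is only a sketch: the real table name is resolved from %{[@metadata][logfile]} at runtime, so the table name logs and the choice of uuid as the primary key here are assumptions, not something taken from the question.

# Hypothetical schema matching the hints map above; the table name "logs"
# and PRIMARY KEY (uuid) are assumed for illustration only.
cqlsh -e "
CREATE TABLE IF NOT EXISTS task3.logs (
    host          text,
    time          timestamp,
    response_time int,
    status        int,
    method        text,
    protocol      text,
    endpoint      text,
    uuid          uuid,
    PRIMARY KEY (uuid)
);"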

After running the following command:

./bin/logstash -f config/pipeline.conf

I get this in the terminal:

Sending Logstash logs to /Users/ar-shreya.eswaraiah/Desktop/logstash-7.6.0/logs which is now configured via log4j2.properties
[2020-02-14T22:42:33,552][WARN ][logstash.config.source.multilocal] Ignoring the 'pipelines.yml' file because modules or command line options are specified
[2020-02-14T22:42:33,716][INFO ][logstash.runner          ] Starting Logstash {"logstash.version"=>"7.6.0"}
[2020-02-14T22:42:36,044][INFO ][org.reflections.Reflections] Reflections took 43 ms to scan 1 urls, producing 20 keys and 40 values 
[2020-02-14T22:42:37,461][INFO ][logstash.outputs.cassandraoutput][main] Establishing control connection
[2020-02-14T22:42:38,015][INFO ][logstash.outputs.cassandraoutput][main] Refreshing connected host's metadata
[2020-02-14T22:42:38,251][INFO ][logstash.outputs.cassandraoutput][main] Completed refreshing connected host's metadata
[2020-02-14T22:42:38,281][INFO ][logstash.outputs.cassandraoutput][main] Refreshing peers metadata
[2020-02-14T22:42:38,302][INFO ][logstash.outputs.cassandraoutput][main] Completed refreshing peers metadata
[2020-02-14T22:42:38,309][INFO ][logstash.outputs.cassandraoutput][main] Refreshing schema
[2020-02-14T22:42:39,524][INFO ][logstash.outputs.cassandraoutput][main] Schema refreshed
[2020-02-14T22:42:39,528][INFO ][logstash.outputs.cassandraoutput][main] Control connection established
[2020-02-14T22:42:39,614][INFO ][logstash.outputs.cassandraoutput][main] Creating session
[2020-02-14T22:42:39,701][INFO ][logstash.outputs.cassandraoutput][main] Session created
[2020-02-14T22:42:39,915][WARN ][org.logstash.instrument.metrics.gauge.LazyDelegatingGauge][main] A gauge metric of an unknown type (org.jruby.RubyArray) has been create for key: cluster_uuids. This may result in invalid serialization.  It is recommended to log an issue to the responsible developer/development team.
[2020-02-14T22:42:39,922][INFO ][logstash.javapipeline    ][main] Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>4, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50, "pipeline.max_inflight"=>500, "pipeline.sources"=>["/Users/ar-shreya.eswaraiah/Desktop/logstash-7.6.0/config/pipeline.conf"], :thread=>"#<Thread:0xbcb14aa run>"}
[2020-02-14T22:42:40,849][INFO ][logstash.javapipeline    ][main] Pipeline started {"pipeline.id"=>"main"}
[2020-02-14T22:42:40,946][INFO ][logstash.agent           ] Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
[2020-02-14T22:42:40,987][INFO ][org.apache.kafka.clients.consumer.ConsumerConfig][main] ConsumerConfig values: 
    allow.auto.create.topics = true
    auto.commit.interval.ms = 5000
    auto.offset.reset = latest
    bootstrap.servers = [localhost:9092]
    check.crcs = true
    client.dns.lookup = default
    client.id = logstash-0
    client.rack = 
    connections.max.idle.ms = 540000
    default.api.timeout.ms = 60000
    enable.auto.commit = true
    exclude.internal.topics = true
    fetch.max.bytes = 52428800
    fetch.max.wait.ms = 500
    fetch.min.bytes = 1
    group.id = logstash
    group.instance.id = null
    heartbeat.interval.ms = 3000
    interceptor.classes = []
    internal.leave.group.on.close = true
    isolation.level = read_uncommitted
    key.deserializer = class org.apache.kafka.common.serialization.StringDeserializer
    max.partition.fetch.bytes = 1048576
    max.poll.interval.ms = 300000
    max.poll.records = 500
    metadata.max.age.ms = 300000
    metric.reporters = []
    metrics.num.samples = 2
    metrics.recording.level = INFO
    metrics.sample.window.ms = 30000
    partition.assignment.strategy = [class org.apache.kafka.clients.consumer.RangeAssignor]
    receive.buffer.bytes = 65536
    reconnect.backoff.max.ms = 1000
    reconnect.backoff.ms = 50
    request.timeout.ms = 30000
    retry.backoff.ms = 100
    sasl.client.callback.handler.class = null
    sasl.jaas.config = null
    sasl.kerberos.kinit.cmd = /usr/bin/kinit
    sasl.kerberos.min.time.before.relogin = 60000
    sasl.kerberos.service.name = null
    sasl.kerberos.ticket.renew.jitter = 0.05
    sasl.kerberos.ticket.renew.window.factor = 0.8
    sasl.login.callback.handler.class = null
    sasl.login.class = null
    sasl.login.refresh.buffer.seconds = 300
    sasl.login.refresh.min.period.seconds = 60
    sasl.login.refresh.window.factor = 0.8
    sasl.login.refresh.window.jitter = 0.05
    sasl.mechanism = GSSAPI
    security.protocol = PLAINTEXT
    send.buffer.bytes = 131072
    session.timeout.ms = 10000
    ssl.cipher.suites = null
    ssl.enabled.protocols = [TLSv1.2, TLSv1.1, TLSv1]
    ssl.endpoint.identification.algorithm = https
    ssl.key.password = null
    ssl.keymanager.algorithm = SunX509
    ssl.keystore.location = null
    ssl.keystore.password = null
    ssl.keystore.type = JKS
    ssl.protocol = TLS
    ssl.provider = null
    ssl.secure.random.implementation = null
    ssl.trustmanager.algorithm = PKIX
    ssl.truststore.location = null
    ssl.truststore.password = null
    ssl.truststore.type = JKS
    value.deserializer = class org.apache.kafka.common.serialization.StringDeserializer

[2020-02-14T22:42:41,131][INFO ][org.apache.kafka.common.utils.AppInfoParser][main] Kafka version: 2.3.0
[2020-02-14T22:42:41,133][INFO ][org.apache.kafka.common.utils.AppInfoParser][main] Kafka commitId: fc1aaa116b661c8a
[2020-02-14T22:42:41,134][INFO ][org.apache.kafka.common.utils.AppInfoParser][main] Kafka startTimeMs: 1581700361128
[2020-02-14T22:42:41,151][INFO ][org.apache.kafka.clients.consumer.KafkaConsumer][main] [Consumer clientId=logstash-0, groupId=logstash] Subscribed to topic(s): test
[2020-02-14T22:42:41,423][INFO ][org.apache.kafka.clients.Metadata][main] [Consumer clientId=logstash-0, groupId=logstash] Cluster ID: 4lsF-DNLS7yRsthTUds7IQ
[2020-02-14T22:42:41,545][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9600}
[2020-02-14T22:42:42,054][INFO ][org.apache.kafka.clients.consumer.internals.AbstractCoordinator][main] [Consumer clientId=logstash-0, groupId=logstash] Discovered group coordinator 192.168.0.103:9092 (id: 2147483647 rack: null)
[2020-02-14T22:42:42,061][INFO ][org.apache.kafka.clients.consumer.internals.ConsumerCoordinator][main] [Consumer clientId=logstash-0, groupId=logstash] Revoking previously assigned partitions []
[2020-02-14T22:42:42,063][INFO ][org.apache.kafka.clients.consumer.internals.AbstractCoordinator][main] [Consumer clientId=logstash-0, groupId=logstash] (Re-)joining group
[2020-02-14T22:42:42,093][INFO ][org.apache.kafka.clients.consumer.internals.AbstractCoordinator][main] [Consumer clientId=logstash-0, groupId=logstash] (Re-)joining group
[2020-02-14T22:42:42,146][INFO ][org.apache.kafka.clients.consumer.internals.AbstractCoordinator][main] [Consumer clientId=logstash-0, groupId=logstash] Successfully joined group with generation 1
[2020-02-14T22:42:42,155][INFO ][org.apache.kafka.clients.consumer.internals.ConsumerCoordinator][main] [Consumer clientId=logstash-0, groupId=logstash] Setting newly assigned partitions: test-0
[2020-02-14T22:42:42,173][INFO ][org.apache.kafka.clients.consumer.internals.ConsumerCoordinator][main] [Consumer clientId=logstash-0, groupId=logstash] Found no committed offset for partition test-0
[2020-02-14T22:42:42,193][INFO ][org.apache.kafka.clients.consumer.internals.SubscriptionState][main] [Consumer clientId=logstash-0, groupId=logstash] Resetting offset for partition test-0 to offset 0.

After this, it just gets stuck here, and there are no updates in the Cassandra table. Where am I going wrong? Is this a plugin installation problem or a configuration problem?

Thanks.

1 Answer

1 vote
/ February 15, 2020

Judging by the last three lines of the log, which show the offset for partition test-0 being reset to 0 in combination with auto.offset.reset = latest, your topic appears to be empty.
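
A quick way to verify this, assuming a stock Kafka installation, is to read the topic from the beginning with the console consumer; if it prints nothing, the topic really has no data:

# Run from the Kafka installation directory; prints every record in the topic.
bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic test --from-beginning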

You need to produce some data into the test topic so that Logstash has something to process.
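
For example, you can type a few test messages with the console producer that ships with Kafka (the path is assumed relative to the Kafka installation; --broker-list is the flag used by the 2.3-era tooling shown in your log):

# Each line typed here becomes one record in the "test" topic.
bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test

Once messages are in the topic, the running Logstash pipeline should pick them up and write rows to Cassandra.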

...