как можно загрузить данные MySQL в друида - PullRequest
0 голосов
/ 05 июля 2018

Из-за потребностей бизнеса мне нужно использовать druid, но druid не может загружать в него данные mysql. Для начала используйте confluent (kafka-connect-jdbc), загружайте данные mysql в kafka, в последний раз загружайте данные в druid. URL демо: (https://www.confluent.io/blog/simplest-useful-kafka-connect-data-pipeline-world-thereabouts-part-1/) Данные Кафки:

 x@dataserv:~/confluent-4.1.1$ bin/kafka-avro-console-consumer --bootstrap-server localhost:9092 --property schema.registry.url=http://localhost:8081 --property print.key=true --from-beginning --topic mysql_foobar
          null    {"c1":{"int":1},"c2":{"string":"foo"},"create_ts":1530778230000,"update_ts":1530778230000}
          null    {"c1":{"int":2},"c2":{"string":"foo"},"create_ts":1530778309000,"update_ts":1530778309000}
          null    {"c1":{"int":3},"c2":{"string":"foo1"},"create_ts":1530778675000,"update_ts":1530778675000}

Затем используйте tranquility-distribution-0.8.2, загружающий данные kafka в druid, но это неудача, попробуйте еще раз и измените файл json (т. Е. Самый простой способ, не выполняющий операций агрегации), но все равно не получится. журнал:

2018-07-05 08:18:04,679 [KafkaConsumer-0] WARN  io.druid.segment.indexing.DataSchema - No metricsSpec has been specified. Are you sure this is what you want?
2018-07-05 08:18:05,427 [KafkaConsumer-0] WARN  io.druid.segment.indexing.DataSchema - No metricsSpec has been specified. Are you sure this is what you want?
2018-07-05 08:18:05,441 [KafkaConsumer-0] WARN  io.druid.segment.indexing.DataSchema - No metricsSpec has been specified. Are you sure this is what you want?
2018-07-05 08:18:05,555 [KafkaConsumer-0] INFO  c.metamx.emitter.core.LoggingEmitter - Start: started [true]
2018-07-05 08:18:10,491 [KafkaConsumer-0] WARN  io.druid.segment.indexing.DataSchema - No metricsSpec has been specified. Are you sure this is what you want?
2018-07-05 08:18:10,635 [KafkaConsumer-CommitThread] INFO  c.m.tranquility.kafka.KafkaConsumer - Flushed {mysql_foobar={receivedCount=1, sentCount=0, droppedCount=0, unparseableCount=1}} pending messages in 5ms and committed offsets in 25ms.
2018-07-05 08:18:25,637 [KafkaConsumer-CommitThread] INFO  c.m.tranquility.kafka.KafkaConsumer - Flushed {mysql_foobar={receivedCount=0, sentCount=0, droppedCount=0, unparseableCount=0}} pending messages in 0ms and committed offsets in 0ms.
2018-07-05 08:18:40,639 [KafkaConsumer-CommitThread] INFO  c.m.tranquility.kafka.KafkaConsumer - Flushed {mysql_foobar={receivedCount=0, sentCount=0, droppedCount=0, unparseableCount=0}} pending messages in 0ms and committed offsets in 0ms.
2018-07-05 08:18:55,640 [KafkaConsumer-CommitThread] INFO  c.m.tranquility.kafka.KafkaConsumer - Flushed {mysql_foobar={receivedCount=0, sentCount=0, droppedCount=0, unparseableCount=0}} pending messages in 0ms and committed offsets in 1ms.
2018-07-05 08:19:10,642 [KafkaConsumer-CommitThread] INFO  c.m.tranquility.kafka.KafkaConsumer - Flushed {mysql_foobar={receivedCount=0, sentCount=0, droppedCount=0, unparseableCount=0}} pending messages in 0ms and committed offsets in 0ms.
2018-07-05 08:19:25,644 [KafkaConsumer-CommitThread] INFO  c.m.tranquility.kafka.KafkaConsumer - Flushed {mysql_foobar={receivedCount=0, sentCount=0, droppedCount=0, unparseableCount=0}} pending messages in 0ms and committed offsets in 0ms.

Не могли бы вы помочь мне решить это? Я не могу решить это, и учиться две недели.

версия: друид-0.12.1 JSON-файл:

{
  "dataSources" : {
    "kafka" : {
      "spec" : {
        "dataSchema" : {
          "dataSource" : "kafka",
          "parser" : {
            "type" : "string",
            "parseSpec" : {
              "timestampSpec" : {
                "column" : "update_ts",
                "format" : "auto"
              },
              "dimensionsSpec" : {
                "dimensions" : ["c1","c2","create_ts"],
                "dimensionExclusions" : [ ]
              },
              "format" : "json"
            }
          },
          "granularitySpec" : {
            "type" : "uniform",
            "segmentGranularity" : "hour",
            "queryGranularity" : "none"
          },
          "metricsSpec" : []
        },
        "ioConfig" : {
          "type" : "realtime"
        },
        "tuningConfig" : {
          "type" : "realtime",
          "maxRowsInMemory" : "100000",
          "intermediatePersistPeriod" : "PT1M",
          "windowPeriod" : "PT2M"
        }
      },
      "properties" : {
        "task.partitions" : "1",
        "task.replicants" : "1",
        "topicPattern" : "mysql_foobar"
      }
    }
  },
  "properties" : {
    "zookeeper.connect" : "192.168.6.231:2181",
    "druid.discovery.curator.path" : "/druid/discovery",
    "druid.selectors.indexing.serviceName" : "druid/overlord",
    "commit.periodMillis" : "15000",
    "consumer.numThreads" : "2",
    "kafka.zookeeper.connect" : "192.168.6.231:2182",
    "kafka.group.id" : "tranquility-kafka"
  }
}
...