Fluentd: parsing a partially JSON log
April 24, 2020

I have the following logs from Apache Druid:

{"timeMillis":1587714600921,"thread":"qtp150208546-149","level":"INFO","loggerName":"org.apache.druid.server.log.LoggingRequestLogger","message":"2020-04-24T07:50:00.798Z\t10.2.73.23\t{\"queryType\":\"segmentMetadata\",\"dataSource\":{\"type\":\"table\",\"name\":\"tableB\"},\"intervals\":{\"type\":\"LegacySegmentSpec\",\"intervals\":[\"1999-01-01T00:00:00.000Z/2114-01-01T00:00:00.000Z\"]},\"toInclude\":{\"type\":\"all\"},\"merge\":true,\"context\":{\"queryId\":\"ed1e7129-1e3f-438d-acbb-04f11d292eb5\"},\"analysisTypes\":[\"aggregators\"],\"usingDefaultInterval\":false,\"lenientAggregatorMerge\":false,\"descending\":false,\"granularity\":{\"type\":\"all\"}}\t{\"query/time\":122,\"query/bytes\":4712,\"success\":true,\"identity\":\"allowAll\"}","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger"}
{"timeMillis":1587714600952,"thread":"qtp150208546-119","level":"INFO","loggerName":"org.apache.druid.server.log.LoggingRequestLogger","message":"2020-04-24T07:50:00.941Z\t10.2.73.23\t{\"queryType\":\"segmentMetadata\",\"dataSource\":{\"type\":\"table\",\"name\":\"test\"},\"intervals\":{\"type\":\"LegacySegmentSpec\",\"intervals\":[\"1999-01-01T00:00:00.000Z/2114-01-01T00:00:00.000Z\"]},\"toInclude\":{\"type\":\"all\"},\"merge\":true,\"context\":{\"queryId\":\"6368e8c4-e18d-4a29-97cc-df2e0aadd02e\"},\"analysisTypes\":[\"aggregators\"],\"usingDefaultInterval\":false,\"lenientAggregatorMerge\":false,\"descending\":false,\"granularity\":{\"type\":\"all\"}}\t{\"query/time\":10,\"query/bytes\":4710,\"success\":true,\"identity\":\"allowAll\"}","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger"}
{"timeMillis":1587714662763,"thread":"qtp150208546-131","level":"INFO","loggerName":"org.apache.druid.server.log.LoggingRequestLogger","message":"2020-04-24T07:51:02.694Z\t\t{\"queryType\":\"topN\",\"dataSource\":{\"type\":\"table\",\"name\":\"tableA\"},\"virtualColumns\":[],\"dimension\":{\"type\":\"default\",\"dimension\":\"key\",\"outputName\":\"d0\",\"outputType\":\"STRING\"},\"metric\":{\"type\":\"numeric\",\"metric\":\"a0\"},\"threshold\":100,\"intervals\":{\"type\":\"intervals\",\"intervals\":[\"2020-03-05T07:51:02.000Z/146140482-04-24T15:36:27.903Z\"]},\"filter\":null,\"granularity\":{\"type\":\"all\"},\"aggregations\":[{\"type\":\"count\",\"name\":\"a0\"}],\"postAggregations\":[],\"context\":{\"queryId\":\"2084372b-f9ec-43c4-a5c7-bb6c28a738fc\",\"sqlQueryId\":\"e137d0ef-38ba-4af5-ac0e-5cdf86439ace\"},\"descending\":false}\t{\"query/time\":68,\"query/bytes\":-1,\"success\":true,\"identity\":\"allowAll\"}","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger"}
{"timeMillis":1587714662763,"thread":"qtp150208546-131","level":"INFO","loggerName":"org.apache.druid.server.log.LoggingRequestLogger","message":"2020-04-24T07:51:02.642Z\t10.2.64.24\t\t{\"sqlQuery/time\":121,\"sqlQuery/bytes\":2095,\"success\":true,\"context\":{\"sqlQueryId\":\"e137d0ef-38ba-4af5-ac0e-5cdf86439ace\",\"nativeQueryIds\":\"[2084372b-f9ec-43c4-a5c7-bb6c28a738fc]\"},\"identity\":\"allowAll\"}\t{\"query\":\"SELECT * FROM (SELECT\\n  \\\"key\\\",\\n  COUNT(*) AS \\\"Count\\\"\\nFROM \\\"tableA\\\"\\nWHERE \\\"__time\\\" >= CURRENT_TIMESTAMP - INTERVAL '50' DAY\\nGROUP BY 1\\nORDER BY \\\"Count\\\" DESC\\n) LIMIT 100\",\"context\":{\"sqlQueryId\":\"e137d0ef-38ba-4af5-ac0e-5cdf86439ace\",\"nativeQueryIds\":\"[2084372b-f9ec-43c4-a5c7-bb6c28a738fc]\"}}","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger"}

I would like to parse the value of the message key with the JSON parser, so that I get the fields and values of each extracted record:

"2020-04-24T07:50:00.798Z\t10.2.73.23\t{\"queryType\":\"segmentMetadata\",\"dataSource\":{\"type\":\"table\",\"name\":\"tableA\"},\"intervals\":{\"type\":\"LegacySegmentSpec\",\"intervals\":[\"1999-01-01T00:00:00.000Z/2114-01-01T00:00:00.000Z\"]},\"toInclude\":{\"type\":\"all\"},\"merge\":true,\"context\":{\"queryId\":\"ed1e7129-1e3f-438d-acbb-04f11d292eb5\"},\"analysisTypes\":[\"aggregators\"],\"usingDefaultInterval\":false,\"lenientAggregatorMerge\":false,\"descending\":false,\"granularity\":{\"type\":\"all\"}}\t{\"query/time\":122,\"query/bytes\":4712,\"success\":true,\"identity\":\"allowAll\"}"
"2020-04-24T07:50:00.941Z\t10.2.73.23\t{\"queryType\":\"segmentMetadata\",\"dataSource\":{\"type\":\"table\",\"name\":\"test\"},\"intervals\":{\"type\":\"LegacySegmentSpec\",\"intervals\":[\"1999-01-01T00:00:00.000Z/2114-01-01T00:00:00.000Z\"]},\"toInclude\":{\"type\":\"all\"},\"merge\":true,\"context\":{\"queryId\":\"6368e8c4-e18d-4a29-97cc-df2e0aadd02e\"},\"analysisTypes\":[\"aggregators\"],\"usingDefaultInterval\":false,\"lenientAggregatorMerge\":false,\"descending\":false,\"granularity\":{\"type\":\"all\"}}\t{\"query/time\":10,\"query/bytes\":4710,\"success\":true,\"identity\":\"allowAll\"}"
"2020-04-24T07:51:02.694Z\t\t{\"queryType\":\"topN\",\"dataSource\":{\"type\":\"table\",\"name\":\"tableA\"},\"virtualColumns\":[],\"dimension\":{\"type\":\"default\",\"dimension\":\"key\",\"outputName\":\"d0\",\"outputType\":\"STRING\"},\"metric\":{\"type\":\"numeric\",\"metric\":\"a0\"},\"threshold\":100,\"intervals\":{\"type\":\"intervals\",\"intervals\":[\"2020-03-05T07:51:02.000Z/146140482-04-24T15:36:27.903Z\"]},\"filter\":null,\"granularity\":{\"type\":\"all\"},\"aggregations\":[{\"type\":\"count\",\"name\":\"a0\"}],\"postAggregations\":[],\"context\":{\"queryId\":\"2084372b-f9ec-43c4-a5c7-bb6c28a738fc\",\"sqlQueryId\":\"e137d0ef-38ba-4af5-ac0e-5cdf86439ace\"},\"descending\":false}\t{\"query/time\":68,\"query/bytes\":-1,\"success\":true,\"identity\":\"allowAll\"}"
"2020-04-24T07:51:02.642Z\t10.2.64.24\t\t{\"sqlQuery/time\":121,\"sqlQuery/bytes\":2095,\"success\":true,\"context\":{\"sqlQueryId\":\"e137d0ef-38ba-4af5-ac0e-5cdf86439ace\",\"nativeQueryIds\":\"[2084372b-f9ec-43c4-a5c7-bb6c28a738fc]\"},\"identity\":\"allowAll\"}\t{\"query\":\"SELECT * FROM (SELECT\\n  \\\"key\\\",\\n  COUNT(*) AS \\\"Count\\\"\\nFROM \\\"tableA\\\"\\nWHERE \\\"__time\\\" >= CURRENT_TIMESTAMP - INTERVAL '50' DAY\\nGROUP BY 1\\nORDER BY \\\"Count\\\" DESC\\n) LIMIT 100\",\"context\":{\"sqlQueryId\":\"e137d0ef-38ba-4af5-ac0e-5cdf86439ace\",\"nativeQueryIds\":\"[2084372b-f9ec-43c4-a5c7-bb6c28a738fc]\"}}"

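As you can see, the message value is not valid JSON as a whole: it is a tab-separated line in which only some columns are JSON. I would like to extract only the JSON parts.
Fluentd version: v1.8.1
I am trying this Fluentd configuration: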
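To make the structure of the message value concrete, here is a minimal Python sketch (not Fluentd configuration) that splits one message on tab characters and decodes the JSON columns. The function name split_druid_message and the output keys timestamp, remote_addr and json_parts are hypothetical names chosen for illustration, and the sample line is an abbreviated version of the first log entry above.

```python
import json


def split_druid_message(message: str) -> dict:
    """Split a tab-separated request-log message and decode its JSON columns.

    Assumption: the first column is a timestamp, the second a client address
    (possibly empty), and every remaining non-empty column is a JSON object.
    """
    columns = message.split("\t")
    parsed = {
        "timestamp": columns[0],
        "remote_addr": columns[1],
        "json_parts": [],
    }
    for column in columns[2:]:
        if column:  # some sample lines contain an empty column
            parsed["json_parts"].append(json.loads(column))
    return parsed


# Abbreviated sample: the query object is shortened for readability.
sample = (
    '2020-04-24T07:50:00.798Z\t10.2.73.23\t'
    '{"queryType":"segmentMetadata"}\t'
    '{"query/time":122,"query/bytes":4712,"success":true,"identity":"allowAll"}'
)
result = split_druid_message(sample)
print(result["json_parts"][1]["query/time"])
```

Splitting on the raw tab character should be safe here, because a raw tab cannot appear unescaped inside a valid JSON string.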

<filter kubernetes.var.log.containers.druid-brokers-**.log>
      @type parser
      key_name $["log"]["message"]
      reserve_data true
      remove_key_name_field true
      hash_value_field parsed
      <parse>
        @type json
      </parse>
</filter>

But I get the following error message:

[warn]: #0 dump an error event: error_class=TypeError error="String does not have #dig method"
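One plausible reading of this error is that the value stored under log is still a plain string (the raw JSON line) rather than a nested object, so a nested lookup such as log -> message has nothing to descend into. The Python sketch below illustrates that situation by analogy only. The record shape and the helper name nested_message are assumptions made for illustration, not taken from the actual Fluentd event.

```python
import json

# Hypothetical record: the "log" field holds the raw JSON line as text,
# not an already-decoded object.
record = {
    "log": '{"level":"INFO","message":"2020-04-24T07:50:00.798Z\\t10.2.73.23"}'
}


def nested_message(rec: dict) -> str:
    """Return rec["log"]["message"], decoding "log" first if it is still text."""
    log = rec["log"]
    if isinstance(log, str):
        # A string cannot be indexed by key, so decode it into a dict first.
        log = json.loads(log)
    return log["message"]


message = nested_message(record)
print(message.split("\t"))
```

If this reading is right, the log field would need to be decoded as JSON in an earlier step before the nested message key can be addressed.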

Thank you for your help.

Regards, Vincent

...